30 research outputs found

    Large Language Models Based Automatic Synthesis of Software Specifications

    Software configurations play a crucial role in determining the behavior of software systems. To ensure safe and error-free operation, it is necessary to identify the correct configurations, along with their valid bounds and rules, which are commonly referred to as software specifications. As software systems grow in complexity and scale, the number of configurations and associated specifications required to ensure correct operation can become large and prohibitively difficult to manipulate manually. Due to the fast pace of software development, correct software specifications are often not thoroughly checked or validated within the software itself. Rather, they are frequently discussed and documented in a variety of external sources, including software manuals, code comments, and online discussion forums. It is therefore hard for a system administrator to know the correct specifications of configurations, given the lack of clarity, organization, and a centralized, unified source to consult. To address this challenge, we propose SpecSyn, a framework that leverages a state-of-the-art large language model to automatically synthesize software specifications from natural language sources. Our approach formulates software specification synthesis as a sequence-to-sequence learning problem and investigates the extraction of specifications from large contextual texts. This is the first work that uses a large language model for end-to-end specification synthesis from natural language texts. Empirical results demonstrate that our system outperforms the prior state-of-the-art specification synthesis tool by 21% in terms of F1 score and can find specifications from single as well as multiple sentences.

    Natural Language is a Programming Language: Applying Natural Language Processing to Software Development

    A powerful, but limited, way to view software is as source code alone. Treating a program as a sequence of instructions enables it to be formalized and makes it amenable to mathematical techniques such as abstract interpretation and model checking. A program consists of much more than a sequence of instructions. Developers make use of test cases, documentation, variable names, program structure, the version control repository, and more. I argue that it is time to take the blinders off of software analysis tools: tools should use all these artifacts to deduce more powerful and useful information about the program. Researchers are beginning to make progress towards this vision. This paper gives, as examples, four results that find bugs and generate code by applying natural language processing techniques to software artifacts. The four techniques use as input error messages, variable names, procedure documentation, and user questions. They use four different NLP techniques: document similarity, word semantics, parse trees, and neural networks. The initial results suggest that this is a promising avenue for future work.

    Staccato: A Bug Finder for Dynamic Configuration Updates


    Configurations everywhere: implications for testing and debugging in practice

    Many industrial systems are highly configurable, complicating the testing and debugging process. While researchers have developed techniques to statically extract, quantify and manipulate the valid system configurations, we conjecture that many of these techniques will fail in practice. In this paper we analyze a highly configurable industrial application and two open source applications in order to quantify the true challenges that configurability creates for software testing and debugging. We find that (1) all three applications consist of multiple programming languages, hence static analyses need to cross programming language barriers to work, (2) there are many access points and methods to modify configurations, implying that practitioners need configuration traceability and should gather and merge metadata from more than one source and (3) the configuration state of an application on failure cannot be reliably determined by reading persistent data; a runtime memory dump or other heuristics must be used for accurate debugging. We conclude with a roadmap and lessons learned to help practitioners better handle configurability now, and that may lead to new configuration-aware testing and debugging techniques in the future.

    Diagnosing Software Configuration Errors via Static Analysis

    Software misconfiguration is responsible for a substantial part of today's system failures, causing about one quarter of all user-reported issues. Identifying the root causes of these failures can be costly in terms of time and human resources. To reduce the effort, researchers from industry and academia have developed many techniques to assist software engineers in troubleshooting software configuration. Unfortunately, applying these techniques to diagnose software misconfigurations is challenging, because the data or operations they require are difficult to obtain in practice. For instance, some techniques rely on a database of configuration data, which is often not publicly available for reasons of data privacy. Some techniques rely heavily on runtime information from a failing run, which requires reproducing the configuration error and rerunning the misconfigured system. Reproducing a configuration error is costly, since misconfiguration depends heavily on the operating environment. Other techniques need testing oracles, which are a challenge for ordinary end users. This thesis explores techniques for diagnosing configuration errors which can be deployed in practice. We develop techniques for troubleshooting software configuration, which rely on static analysis of a software system and do not need to execute the application. The source code and configuration documents of a system required by the techniques are often available, especially for open source software programs. Our techniques can be deployed as third-party services. The first technique addresses configuration errors due to erroneous option values. Our technique analyzes software programs and infers whether there exists a possible execution path from where an option value is loaded to the code location where the failure becomes visible. Options whose values might flow into such a crashing site are considered possible root causes of the error. Finally, we compute the degree of correlation of these options with the error using the stack trace information of the error, and rank them. The top-ranked options are more likely to be the root cause of the error. Our evaluation shows the technique is highly effective in diagnosing the root causes of configuration errors. The second technique automatically extracts the names of options read by a program and their read points in the source code. We first identify statements loading option values, then infer which options are read by each statement, and finally output a map of these options and their read points. With the map, we are able to detect options in the documents which are not read by the corresponding version of the program. This allows locating configuration errors due to inconsistencies between configuration documents and source code. Our evaluation shows that the technique can precisely identify option read points and infer option names, and discovers multiple previously unknown inconsistencies between documented options and source code.

    Realistic tool-tissue interaction models for surgical simulation and planning

    Surgical simulators present a safe and potentially effective method for surgical training, and can also be used in pre- and intra-operative surgical planning. Realistic modeling of medical interventions involving tool-tissue interactions has been considered to be a key requirement in the development of high-fidelity simulators and planners. The soft-tissue constitutive laws, organ geometry and boundary conditions imposed by the connective tissues surrounding the organ, and the shape of the surgical tool interacting with the organ are some of the factors that govern the accuracy of medical intervention planning. This thesis is divided into three parts. First, we compare the accuracy of linear and nonlinear constitutive laws for tissue. An important consequence of nonlinear models is the Poynting effect, in which shearing of tissue results in normal force; this effect is not seen in a linear elastic model. The magnitude of the normal force for myocardial tissue is shown to be larger than the human contact force discrimination threshold. Further, in order to investigate and quantify the role of the Poynting effect on material discrimination, we perform a multidimensional scaling study. Second, we consider the effects of organ geometry and boundary constraints in needle path planning. Using medical images and tissue mechanical properties, we develop a model of the prostate and surrounding organs. We show that, for needle procedures such as biopsy or brachytherapy, organ geometry and boundary constraints have more impact on target motion than tissue material parameters. Finally, we investigate the effects of surgical tool shape on the accuracy of medical intervention planning. We consider the specific case of robotic needle steering, in which asymmetry of a bevel-tip needle results in the needle naturally bending when it is inserted into soft tissue. We present an analytical and finite element (FE) model for the loads developed at the bevel tip during needle-tissue interaction. The analytical model explains trends observed in the experiments. We incorporated physical parameters (rupture toughness and nonlinear material elasticity) into the FE model that included both contact and cohesive zone models to simulate tissue cleavage. The model shows that the tip forces are sensitive to the rupture toughness. In order to model the mechanics of deflection of the needle, we use an energy-based formulation that incorporates tissue-specific parameters such as rupture toughness, nonlinear material elasticity, and interaction stiffness, and needle geometric and material properties. Simulation results follow similar trends (deflection and radius of curvature) to those observed in macroscopic experimental studies of a robot-driven needle interacting with gels.