185,577 research outputs found

    Impact of Large Language Models on Generating Software Specifications

    Full text link
    Software specifications are essential for ensuring the reliability of software systems. Existing specification extraction approaches, however, suffer from limited generalizability and require manual efforts. We study the effectiveness of Large Language Models (LLMs) in generating software specifications from software documentation, utilizing Few-Shot Learning (FSL) to enable LLMs to generalize from a small number of examples. We compare the performance of LLMs with FSL to that of state-of-the-art specification extraction techniques and study the impact of prompt construction strategies on LLM performance. In addition, we conduct a comprehensive analysis of their symptoms and root causes of the failures to understand the pros and cons of LLMs and existing approaches. We also compare 11 LLMs' performance, cost, and response time for generating software specifications. Our findings include that (1) the best performing LLM outperforms existing approaches by 9.1--13.7% with a few similar examples, (2) the two dominant root causes combined (ineffective prompts and missing domain knowledge) result in 57--60% of LLM failures, and (3) most of the 11 LLMs achieve better or comparable performance compared to traditional techniques. Our study offers valuable insights for future research to improve specification generation

    Synthesis of Logic Programs from Object-Oriented Formal Specifications

    Get PDF
    Early validation of requirements is crucial for the rigorous development of software. Without it, even the most formal of the methodologies will produce the wrong outcome. One successful approach, popularised by some of the so-called lightweight formal methods, consists in generating (finite, small) models of the specifications. Another possibility is to build a running prototype from those specifications. In this paper we show how to obtain executable prototypes from formal specifications written in an object oriented notation by translating them into logic programs. This has some advantages over other lightweight methodologies. For instance, we recover the possibility of dealing with recursive data types as specifications that use them often lack finite models

    A Domain Analysis to Specify Design Defects and Generate Detection Algorithms

    Get PDF
    Quality experts often need to identify in software systems design defects, which are recurring design problems, that hinder development\ud and maintenance. Consequently, several defect detection approaches\ud and tools have been proposed in the literature. However, we are not\ud aware of any approach that defines and reifies the process of generating\ud detection algorithms from the existing textual descriptions of defects.\ud In this paper, we introduce an approach to automate the generation\ud of detection algorithms from specifications written using a domain-specific\ud language. The domain-specific is defined from a thorough domain analysis.\ud We specify several design defects, generate automatically detection\ud algorithms using templates, and validate the generated detection\ud algorithms in terms of precision and recall on Xerces v2.7.0, an\ud open-source object-oriented system

    A framework for pathologies of message sequence charts

    Get PDF
    This is the post-print version of the final paper published in Information Software and Technology. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright @ 2012 Elsevier B.V.Context - It is known that a Message Sequence Chart (MSC) specification can contain different types of pathology. However, definitions of different types of pathology and the problems caused by pathologies are unclear, let alone the relationships between them. In this circumstance, it can be problematic for software engineers to accurately predict the possible problems that may exist in implementations of MSC specifications and to trace back to the design problems in MSC specifications from the observed problems of an implementation. Objective - We focus on generating a clearer view on MSC pathologies and building formal relationships between pathologies and the problems that they may cause. Method - By concentrating on the problems caused by pathologies, a categorisation of problems that a distributed system may suffer is first introduced. We investigate the different types of problems and map them to categories of pathologies. Thus, existing concepts related to pathology are refined and necessary concepts in the pathology framework are identified. Finally, we formally prove the relationships between the concepts in the framework. Results - A pathology framework is established as desired based on a restriction that considers problematic scenarios with a single undesirable event. In this framework, we define disjoint categories of both pathologies and the problems caused; the identified types of pathology are successfully mapped to the problems that they may cause. Conclusion - The framework achieved in this paper introduces taxonomies into and clarifies relationships between concepts in research on MSC pathologies. The taxonomies and relationships in the framework can help software engineers to predict problems and verify MSC specifications. The single undesirable event restriction not only enables a categorisation of pathological scenarios, but also has the potential practical benefit that a software engineer can concentrate on key problematic scenarios. This may make it easier to either remove pathologies from an MSC specification MM or test an implementation developed from MM for potential problems resulting from such pathologies

    Mathematical Digital Terrain Model (DTM) Data for Testing DTM Generation Methodologies and Software

    Get PDF
    Developing a method for interpolation of data points for forming DTM involves thousands of complicated computations. Checking of such computations requires errorless and well-defined input data and end results from a large number of DTM points. Measured, or actual, DTM data points may not be available or invariably have errors which affect the course of the computations and analysis. Most of these problems can be overcome by mathematically generating DTM data points.  This paper focuses on developing software, called MathDTM, for generating mathematical (simulated) DTM data at desired specifications. An overview of the development and capabilities of MathDTM software platform based on Windows system is presented. The developed software was used for testing the performance and accuracy of different interpolation methods of Surfer software. The results of these tests were then discussed and evaluated

    A PVS-Simulink Integrated Environment for Model-Based Analysis of Cyber-Physical Systems

    Get PDF
    This paper presents a methodology, with supporting tool, for formal modeling and analysis of software components in cyber-physical systems. Using our approach, developers can integrate a simulation of logic-based specifications of software components and Simulink models of continuous processes. The integrated simulation is useful to validate the characteristics of discrete system components early in the development process. The same logic-based specifications can also be formally verified using the Prototype Verification System (PVS), to gain additional confidence that the software design complies with specific safety requirements. Modeling patterns are defined for generating the logic-based specifications from the more familiar automata-based formalism. The ultimate aim of this work is to facilitate the introduction of formal verification technologies in the software development process of cyber-physical systems, which typically requires the integrated use of different formalisms and tools. A case study from the medical domain is used to illustrate the approach. A PVS model of a pacemaker is interfaced with a Simulink model of the human heart. The overall cyber-physical system is co-simulated to validate design requirements through exploration of relevant test scenarios. Formal verification with the PVS theorem prover is demonstrated for the pacemaker model for specific safety aspects of the pacemaker design

    Learning Program Specifications from Sample Runs

    Get PDF
    With science fiction of yore being reality recently with self-driving cars, wearable computers and autonomous robots, software reliability is growing increasingly important. A critical pre-requisite to ensure the software that controls such systems is correct is the availability of precise specifications that describe a program\u27s intended behaviors. Generating these specifications manually is a challenging, often unsuccessful, exercise; unfortunately, existing static analysis techniques often produce poor quality specifications that are ineffective in aiding program verification tasks. In this dissertation, we present a recent line of work on automated synthesis of specifications that overcome many of the deficiencies that plague existing specification inference methods. Our main contribution is a formulation of the problem as a sample driven one, in which specifications, represented as terms in a decidable refinement type representation, are discovered from observing a program\u27s sample runs in terms of either program execution paths or input-output values, and automatically verified through the use of expressive refinement type systems. Our approach is realized as a series of inductive synthesis frameworks, which use various logic-based or classification-based learning algorithms to provide sound and precise machine-checked specifications. Experimental results indicate that the learning algorithms are both efficient and effective, capable of automatically producing sophisticated specifications in nontrivial hypothesis domains over a range of complex real-world programs, going well beyond the capabilities of existing solutions

    A Method and Tool Support for Generating SOFL Formal Specifications from Programs

    Get PDF
    Software systems developed in practice often lack an appropriate specification defining their functionality. This is also true in legacy software systems and many realistic software projects. Moreover, software review detects bugs, but the result of review depends on the reviewer. To deal with this problem, we put forward an approach to automatically generating formal specifications from source code as a step of reverse engineering. This includes transformations at two levels. One is to transform source code into Condition Data Flow Diagrams (CDFD) used in the Structured Object-Oriented Formal Language (SOFL) specification language, which includes mainly dealing with sequence, selection, and repetition. The other is to support the formation of formal specifications for operations involved in the CDFD and SOFL specification. SOFL is a formal engineering method. It provides a formal language which integrates structured method and object oriented language. By providing a tool support for generating SOFL specifications, we could provide a useful support for the construction of specifications in SOFL and help detect bugs in the source code. The tool can also help visualize the relations between operations for confirmation
    corecore