Search CORE

9 research outputs found

Automatic Steering of Behavioral Model Inference

Author: LO David
Mariani Leonardo
Pezze Mauro
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2009
Field of study

Many testing and analysis techniques use finite state mod-els to validate and verify the quality of software systems. Since the specification of such models is complex and time-consuming, researchers defined several techniques to extract finite state models from code and traces. Automatically generating models requires much less effort than designing them, and thus eases the verification and validation of large software systems. However, when models are inferred au-tomatically, the precision of the mining process is critical. Behavioral models mined with imprecise processes can in-clude many spurious behaviors, and can thus compromise the results of testing and analysis techniques that use those models. In this paper, we increase the precision of automata in-ferred from execution traces, by leveraging two learning tech-niques. We first mine execution traces to infer statistically significant temporal properties that capture relations be-tween non consecutive and possibly distant events. We then incrementally refine a simple initial automaton by merg-ing likely equivalent states. We identify equivalent states by analyzing set of consecutive events, and we use the in-ferred temporal properties to evaluate whether two equiv-alent states can be merged or not. We merge equivalent states only if the merging does violate any temporal prop-erty, since a merging that violates temporal properties is likely to introduce an imprecise generalization. Our gener-alization process that preserves temporal properties while merging states avoids breaking non-local relations, and thus solves one of the major cause of overgeneralized models. Thus, mined properties steer the learning of behavioral mod-els. The technique is completely automated and generates an automaton that both accepts the input traces and satis-fies the mined temporal properties. ∗This work has been partially supported by the Europea

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

Mining preconditions of APIs in large-scale code corpus

Author: Dyer Robert
Nguyen Hoan
Nguyen Tien
Rajan Hridesh
Rajan Hridesh
Publication venue: Iowa State University Digital Repository
Publication date: 11/11/2014
Field of study

Modern software relies on existing application programming interfaces (APIs) from libraries. Formal specifications for the APIs enable many software engineering tasks as well as help developers correctly use them. In this work, we mine large-scale repositories of existing open-source software to derive potential preconditions for API methods. Our key idea is that APIs’ preconditions would appear frequently in an ultra-large code corpus with a large number of API usages, while project-specific conditions will occur less frequently. First, we find all client methods invoking APIs. We then compute a control dependence relation from each call site and mine the potential conditions used to reach those call sites. We use these guard conditions as a starting point to automatically infer the preconditions for each API. We analyzed almost 120 million lines of code from SourceForge and Apache projects to infer preconditions for the standard Java Development Kit (JDK) library. The results show that our technique can achieve high accuracy with recall from 75–80% and precision from 82–84%. We also found 5 preconditions missing from human written specifications. They were all confirmed by a specification expert. In a user study, participants found 82% of the mined preconditions as a good starting point for writing specifications. Using our mining result, we also built a benchmark of more than 4,000 precondition-related bugs

Digital Repository @ Iowa State University (ISU)

A framework for the evaluation of specification miners based on finite state machines

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

Synergizing specification miners through model fissions and fusions

Author: BESCHASTNIKH Ivan
David LO
LE BUI TIEN DUY
LE DINH XUAN BACH
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

Abstract—Software systems are often developed and released without formal specifications. For those systems that are formally specified, developers have to continuously maintain and update the specifications or have them fall out of date. To deal with the absence of formal specifications, researchers have proposed tech-niques to infer the missing specifications of an implementation in a variety of forms, such as finite state automaton (FSA). Despite the progress in this area, the efficacy of the proposed specification miners needs to improve if these miners are to be adopted. We propose SpecForge, a new specification mining approach that synergizes many existing specification miners. SpecForge decomposes FSAs that are inferred by existing miners into simple constraints, through a process we refer to as model fission. It then filters the outlier constraints and fuses the constraints back together into a single FSA (i.e., model fusion). We have evaluated SpecForge on execution traces of 10 programs, which includes 5 programs from DaCapo benchmark, to infer behavioral models of 13 library classes. Our results show that SpecForge achieves an average precision, recall and F-measure of 90.57%, 54.58%, and 64.21 % respectively. SpecForge outperforms the best performing baseline by 13.75 % in terms of F-measure

CiteSeerX

Institutional Knowledge at Singapore Management University

An interview study about the use of logs in embedded software engineering

Author: Cuijpers Pieter
Hendriks Dennis
Lukkien Johan
Schiffelers Ramon
Serebrenik Alexander
Yang Nan
Publication venue
Publication date: 11/02/2023
Field of study

Context: Execution logs capture the run-time behavior of software systems. To assist developers in their maintenance tasks, many studies have proposed tools to analyze execution information from logs. However, it is as yet unknown how industry developers use logs in embedded software engineering. Objective: In this study, we aim to understand how developers use logs in an embedded software engineering context. Specifically, we would like to gain insights into the type of logs developers analyze, the purposes for which developers analyze logs, the information developers need from logs and their expectation on tool support. Method: In order to achieve the aim, we conducted these interview studies. First, we interviewed 25 software developers from ASML, which is a leading company in developing lithography machines. This exploratory case study provides the preliminary findings. Next, we validated and refined our findings by conducting a replication study. We involved 14 interviewees from four companies who have different software engineering roles in their daily work. Results: As the result of our first study, we compile a preliminary taxonomy which consists of four types of logs used by developers in practice, 18 purposes of using logs, 13 types of information developers search in logs, 13 challenges faced by developers in log analysis and three suggestions for tool support provided by developers. This taxonomy is refined in the replication study with three additional purposes, one additional information need, four additional challenges and three additional suggestions of tool support. In addition, with these two studies, we observed that text-based editors and self-made scripts are commonly used when it comes to tooling in log analysis practice. As indicated by the interviewees, the development of automatic analysis tools is hindered by the quality of the logs, which further suggests several challenges in log instrumentation and management. Conclusions: Based on our study, we provide suggestions for practitioners on logging practices. We provide implications for tool builders on how to further improve tools based on existing techniques. Finally, we suggest some research directions and studies for researchers to further study software logging.</p

Pure OAI Repository

Recommended from our members

Semantics-driven extraction of timed automata from Java programs

Author: Khan Muhammad Taimoor
Liva Giovanni
Pinzger Martin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2019
Field of study

The automatic verification of time properties of models extracted from programs is challenging, mainly because modern programming languages, such as Java, represent time without a proper semantics. Current approaches to extract time models from source code either represent time only as a tree-like sequence of events or require developers to manually provide a formal model of the time behavior. This makes it difficult for software developers to verify various aspects of their systems, such as timeouts, delays and periodicity of the execution. In this paper, we introduce a formal definition of the time semantics for the Java programming language. Based on the semantics, we present an approach to automatically extract timed automata and their time constraints from Java programs at method level. First, our approach detects the Java statements that involve time, from which it then extracts the timed automata. Our extracted automata are directly amenable to the verification of time properties of the corresponding Java methods. We evaluated the accuracy of our approach on twenty open source Java projects that implement time behavior in their source code. The results show that our approach achieves 100% precision and recall in identifying time related information. They also show that 95% of the timed automata extracted from source code correctly model the time behavior of the method. Finally, we show the applicability of our timed automata to identify eight real errors in four open source Apache systems

Greenwich Academic Literature Archive

Analyzing repetitiveness in big code to support software maintenance and evolution

Author: Nguyen Hoan Anh
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2015
Field of study

Software systems inevitably contain a large amount of repeated artifacts at different level of abstraction---from ideas, requirements, designs, algorithms to implementation. This dissertation focuses on analyzing software repetitiveness at implementation code level and leveraging the derived knowledge for easing tasks in software maintenance and evolution such as program comprehension, API use, change understanding, API adaptation and bug fixing. The guiding philosophy of this work is that, in a large corpus, code that conforms to specifications appears more frequently than code that does not, and similar code is changed similarly and similar code could have similar bugs that can be fixed similarly. We have developed different representations for software artifacts at source code level, and the corresponding algorithms for measuring code similarity and mining repeated code. Our mining techniques bases on the key insight that code that conforms to programming patterns and specifications appears more frequently than code that does not. Thus, correct patterns and specifications can be mined from large code corpus. We also have built program differencing techniques for analyzing changes in software evolution. Our key insight is that similar code is likely changed in similar ways and similar code likely has similar bug(s) which can be fixed similarly. Therefore, learning changes and fixes from the past can help automatically detect and suggest changes/fixes to the repeated code in software development. Our empirical evaluation shows that our techniques can accurately and efficiently detect repeated code, mine useful programming patterns and API specifications, and recommend changes. It can also detect bugs and suggest fixes, and provide actionable insights to ease maintenance tasks. Specifically, our code clone detection tool detects more meaningful clones than other tools. Our mining tools recover high quality programming patterns and API preconditions. The mined results have been used to successfully detect many bugs violating patterns and specifications in mature open-source systems. The mined API preconditions are shown to help API specification writer identify missing preconditions in already-specified APIs and start building preconditions for the not-yet-specified ones. The tools are scalable which analyze large systems in reasonable times. Our study on repeated changes give useful insights for program auto-repair tools. Our automated change suggestion approach achieves top-1 accuracy of 45%-51% which relatively improves more than 200% over the base approach. For a special type of change suggestion, API adaptation, our tool is highly correct and useful

Digital Repository @ Iowa State University (ISU)