Lossless, Persisted Summarization of Static Callgraph, Points-To and Data-Flow Analysis
Static analysis is used to automatically detect bugs and security vulnerabilities, and it aids compiler optimization. Whole-program analysis (WPA) can yield high precision, but it causes long analysis times and thus does not match common software-development workflows, often making it impractical for large, real-world applications.
This paper thus presents the design and implementation of ModAlyzer, a novel static-analysis approach that aims at accelerating whole-program analysis by making the analysis modular and compositional. It shows how to compute lossless, persisted summaries for callgraph, points-to and data-flow information, and it reports under which circumstances this function-level compositional analysis outperforms WPA.
We implemented ModAlyzer as an extension to LLVM and PhASAR, and applied it to 12 real-world C and C++ applications. At analysis time, ModAlyzer modularly and losslessly summarizes the analysis effect of the library code those applications share, hence avoiding its repeated re-analysis. The experimental results show that the reuse of these summaries can save, on average, 72% of analysis time over WPA. Moreover, because it is lossless, the module-wise analysis fully retains precision and recall. Surprisingly, as our results show, it sometimes even yields precision superior to WPA. The initial summary generation, on average, takes about 3.67 times as long as WPA.
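The reuse mechanism can be pictured as a content-addressed summary store. The sketch below is a hypothetical Python illustration of that idea only; the names `summaries/`, `module_key`, and `analyse` are invented here and do not reflect ModAlyzer's actual persistence format:

```python
import hashlib
import json
import pathlib

CACHE = pathlib.Path("summaries")  # hypothetical on-disk summary store


def module_key(source: str) -> str:
    """Content hash of a module: unchanged library code keeps its key."""
    return hashlib.sha256(source.encode()).hexdigest()


def summarise(source: str, analyse):
    """Return the persisted summary for a module, re-analysing only on a miss."""
    CACHE.mkdir(exist_ok=True)
    path = CACHE / (module_key(source) + ".json")
    if path.exists():
        # A summary for identical module code was persisted earlier: reuse it.
        return json.loads(path.read_text())
    summary = analyse(source)  # the expensive analysis runs once per module version
    path.write_text(json.dumps(summary))
    return summary
```

Because the key is a hash of the module contents, shared library code that has not changed between applications hits the cache, which is the effect the 72% saving above comes from.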
Network-clustered multi-modal bug localization
Developers often spend much effort and resources to debug a program. To help developers debug, numerous information retrieval (IR)-based and spectrum-based bug localization techniques have been devised. IR-based techniques process textual information in bug reports, while spectrum-based techniques process program spectra (i.e., a record of which program elements are executed for each test case). While both techniques ultimately generate a ranked list of program elements that likely contain a bug, each considers only one source of information, either bug reports or program spectra, which is not optimal. In light of this deficiency, this paper presents a new approach dubbed Network-clustered Multi-modal Bug Localization (NetML), which utilizes multi-modal information from both bug reports and program spectra to localize bugs. NetML facilitates effective bug localization by carrying out a joint optimization of bug localization error and clustering of both bug reports and program elements (i.e., methods). The clustering is achieved through the incorporation of network Lasso regularization, which incentivizes the model parameters of similar bug reports and similar program elements to be close together. To estimate the model parameters of both bug reports and methods, NetML employs an adaptive learning procedure based on Newton's method that updates the parameters on a per-feature basis. Extensive experiments on 355 real bugs from seven software systems have been conducted to benchmark NetML against various state-of-the-art localization methods. The results show that NetML surpasses the best-performing baseline by 31.82%, 22.35%, 19.72%, and 19.24%, in terms of the number of bugs successfully localized when a developer inspects the top 1, 5, and 10 methods, and Mean Average Precision (MAP), respectively.
Comment: IEEE Transactions on Software Engineering
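For readers unfamiliar with the network Lasso term used above, its standard objective has the following shape. The notation here is generic rather than NetML's own: $\ell_i$ is the localization loss at node $i$ (a bug report or method), $\mathcal{E}$ the set of similarity edges, and $w_{ij}$ the edge weights:

```latex
\min_{\theta_1,\dots,\theta_n}\; \sum_{i=1}^{n} \ell_i(\theta_i)
\;+\; \lambda \sum_{(i,j)\in\mathcal{E}} w_{ij}\,\lVert \theta_i - \theta_j \rVert_2
```

Because the penalty sums unsquared $\ell_2$ norms, it drives the parameters of strongly connected nodes to become exactly equal, which is what yields the clustering of similar bug reports and methods described above.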
Assessing Language Quality in the Information Systems Development Process – A Theoretical Approach and Its Application
The necessary knowledge transfer and communication between project members are identified as a relevant issue in information systems development (ISD). Nevertheless, the impact of linguistic communication on ISD and on requirements specification, viewed as a process, is still an open issue. In our research, we claim that the effectiveness of ISD depends on the ability to manage how people deal with language in practice and reach a shared understanding. We propose the concept of language quality as a suitable means for analyzing the emergence of concise and meaningful requirements in ISD. By applying the resulting language quality dimensions to a real project, we obtained practice-grounded propositions for practitioners to consider and for researchers to use in further evaluating the consequences of different actions on interaction and communication processes in this field.
MORE: A multi‐objective refactoring recommendation approach to introducing design patterns and fixing code smells
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/137556/1/smr1843.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/137556/2/smr1843_am.pd
Empirical Studies of Android API Usage: Suggesting Related API Calls and Detecting License Violations
We mine the API method calls used by Android App developers to (1) suggest related API calls based on the version history of Apps, (2) suggest related API calls based on StackOverflow posts, and (3) find potential App copyright and license violations based on the similarity of the API calls they make.
Zimmermann et al. suggested that “Programmers who changed these functions also changed” functions that could be mined from previous groupings of functions found in the version history of a system. Our first contribution is to expand this approach to a community of Apps. Android developers use a set of API calls when creating Apps. These API methods are used in similar ways across multiple applications. By clustering co-changing API methods used by 230 Android Apps, we are able to predict the changes to API methods that individual App developers will make to their application with an average precision of 73% and recall of 25%.
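As a hedged illustration of the co-change idea (not the thesis's actual clustering pipeline), pair-counting over version-history changesets might look like the sketch below; `changesets` and `min_support` are names invented for this example:

```python
from collections import Counter
from itertools import combinations


def cochange_pairs(changesets, min_support=3):
    """Count API methods that change together across commits.

    changesets: list of sets, each the API methods touched in one commit.
    Returns pairs seen together at least `min_support` times, which can back
    suggestions of the form "developers who changed A also changed B".
    """
    counts = Counter()
    for methods in changesets:
        for a, b in combinations(sorted(methods), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}
```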
Our second contribution can be characterized as “Programmers who discussed these functions were also interested in these functions.” Informal discussion on StackOverflow provides a rich source of related API methods as developers provide solutions to common problems. By clustering salient API methods in the same highly ranked posts, we are able to create rules that predict the changes App developers will make with an average precision of 64% and recall of 15%.
Our last contribution is to find out whether proprietary Apps copy code from open source Apps, thereby violating the open source license. We have provided a set of techniques that determine how similar two Apps are based on the API calls they make. These techniques include Android API call matching, API call coverage, App categories, method/class clusters, and the released size of Apps. To validate this approach, we conduct a case study of 150 open source projects and 950 proprietary projects.
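One simple instantiation of API-call-based similarity (the thesis combines several such signals, of which this is only the most basic) is set overlap between the API calls two Apps make:

```python
def api_call_similarity(calls_a: set[str], calls_b: set[str]) -> float:
    """Jaccard similarity of two Apps' API-call sets; 1.0 means identical usage."""
    union = calls_a | calls_b
    return len(calls_a & calls_b) / len(union) if union else 0.0
```

A high score between a proprietary App and an open source App would flag the pair for closer manual inspection rather than prove a violation on its own.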
The Effectiveness of t-Way Test Data Generation
Modern society is increasingly dependent on the correct functioning of software and increasingly so in areas that are considered safety related or safety critical. Therefore, there is an increasing need to be able to verify and validate that the software is in fact correct and will perform its intended function. Many approaches to this problem have been proposed; however, none seems likely to supplant the role of testing in the near future.
If we accept that there is, and will be, a continuing need to test software, then the question becomes how this can be done effectively, both in terms of the ability to detect errors and in terms of cost. One avenue of research that offers prospects of improving both of these aspects is the automatic generation of test data.
There has recently been a large amount of work conducted in this area. One particularly promising direction has been the application of ideas from the field of experimental design and, in particular, of t-way adequate factorial designs.
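To make the idea concrete: a 2-way (pairwise) adequate test set covers every pair of parameter values in at least one test. The greedy sketch below is a didactic illustration under that definition, not the tooling used in this work; its exhaustive candidate scan is exponential in the number of parameters:

```python
from itertools import combinations, product


def pairwise_tests(factors):
    """Greedily build a 2-way (pairwise) covering test set.

    `factors` maps each parameter name to its list of levels.
    """
    names = list(factors)
    # Every pair of (parameter, value) settings that must appear in some test.
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in factors[a]
        for vb in factors[b]
    }
    tests = []
    while uncovered:
        # Pick the candidate test covering the most still-uncovered pairs.
        best = max(
            (dict(zip(names, values)) for values in product(*factors.values())),
            key=lambda t: sum(
                1 for (a, va), (b, vb) in uncovered if t[a] == va and t[b] == vb
            ),
        )
        uncovered = {
            ((a, va), (b, vb))
            for (a, va), (b, vb) in uncovered
            if not (best[a] == va and best[b] == vb)
        }
        tests.append(best)
    return tests


# Three binary parameters are pairwise-covered by far fewer than 8 tests.
print(pairwise_tests({"x": [0, 1], "y": [0, 1], "z": [0, 1]}))
```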
The area, however, is not without issues; there is evidence that the technique is capable of detecting errors, but that evidence is not unequivocal. Moreover, as with almost all work in the area of automatic test generation, there has been very little comparative work evaluating the technique against other test data generation techniques. Worse, there has been effectively no work that compares any automatic test data generation technique with the effectiveness of tests generated by humans. Another major issue is the number of tests that applying the technique can produce. This implies that an automated oracle is needed if the technique is to be applied successfully. The flaw with this is, of course, that in most situations the oracle is the human conducting the tests, a point often ignored in testing research.
The work presented here addresses both of these points. To do this I have used a code base taken from an industrial engine control system that has an existing set of high-quality unit tests developed by hand. To complement this, several other techniques for automatically generating test data have been applied, namely random testing, random experimental designs, and a technique for generating single-factor experiments. To compare the error-detection ability of all of the sets of test vectors, rather than rely on the usual effectiveness surrogate of code coverage, I have used mutation analysis on the code base to directly measure the ability of each set of test vectors to discover common coding errors. The results presented here show that test data generation techniques based on t-way factorial designs are at least as effective as hand-generated tests and superior to random testing and the single-factor experimental technique.
The oracle problem associated with the factorial design techniques was addressed using a test set minimisation approach. The mutation tool monitored which vectors could “kill” which code mutants. After a subset of the test vectors had been run, the most effective vectors were retained and the rest discarded. Likewise, mutants that were killed were removed from further consideration and the process repeated. Experimental results show that this minimisation procedure is effective at reducing computational overhead and is capable of producing final sets of test vectors that are comparable in size with the sets of hand-generated tests and so amenable to final hand checking.
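A minimal sketch of that greedy loop, assuming a precomputed kill matrix; `kills` maps each vector id to the set of mutant ids it kills, and both names are invented for this example:

```python
def minimise(kills: dict[str, set[str]]) -> list[str]:
    """Greedy test-set minimisation over a mutation kill matrix.

    Repeatedly keep the vector that kills the most still-live mutants,
    drop those mutants, and repeat until nothing remains to cover.
    """
    live = set().union(*kills.values()) if kills else set()
    kept = []
    while live:
        best = max(kills, key=lambda v: len(kills[v] & live))
        gained = kills[best] & live
        if not gained:
            break  # defensive; unreachable when live starts as the union of kills
        kept.append(best)
        live -= gained
    return kept
```

This is the classic greedy set-cover heuristic; it does not guarantee a minimum-size set, but it matches the retain-the-most-effective-vector loop described above.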