
    Lossless, Persisted Summarization of Static Callgraph, Points-To and Data-Flow Analysis

    Static analysis is used to automatically detect bugs and security breaches, and it aids compiler optimization. Whole-program analysis (WPA) can yield high precision, but it causes long analysis times and thus does not match common software-development workflows, often making it impractical for large, real-world applications. This paper thus presents the design and implementation of ModAlyzer, a novel static-analysis approach that aims at accelerating whole-program analysis by making the analysis modular and compositional. It shows how to compute lossless, persisted summaries for callgraph, points-to and data-flow information, and it reports under which circumstances this function-level compositional analysis outperforms WPA. We implemented ModAlyzer as an extension to LLVM and PhASAR, and applied it to 12 real-world C and C++ applications. At analysis time, ModAlyzer modularly and losslessly summarizes the analysis effect of the library code those applications share, hence avoiding its repeated re-analysis. The experimental results show that the reuse of these summaries can save, on average, 72% of analysis time over WPA. Moreover, because it is lossless, the module-wise analysis fully retains precision and recall. Surprisingly, as our results show, it sometimes even yields precision superior to WPA. The initial summary generation takes, on average, about 3.67 times as long as WPA.
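The two averages quoted in this abstract raise a simple amortization question: after how many reuses does the one-time summary generation pay for itself? The sketch below is a back-of-the-envelope calculation only. It assumes, as a simplification the abstract does not state, that both averages apply to the same workload, that the 3.67x summary cost is paid once up front, and that every later run costs 28% of a WPA run. The function name and the cost model are hypothetical.

```python
import math


def break_even_runs(summary_cost: float, saving: float) -> int:
    """Smallest number of runs n for which the modular approach is no slower overall.

    Costs are measured in units of one WPA run (assumed model):
      WPA total     = n
      modular total = summary_cost + n * (1 - saving)
    Solving modular <= WPA gives n >= summary_cost / saving.
    """
    if not 0 < saving <= 1:
        raise ValueError("saving must be in (0, 1]")
    return math.ceil(summary_cost / saving)


# Figures taken from the abstract: 3.67x one-time cost, 72% saving per reuse.
runs = break_even_runs(3.67, 0.72)
print(runs)  # -> 6
```

Under these assumptions the summaries pay off from the sixth analysis run onward; a different workload mix would shift that point.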

    Network-clustered multi-modal bug localization

    Developers often spend considerable effort and resources debugging a program. To help developers debug, numerous information retrieval (IR)-based and spectrum-based bug localization techniques have been devised. IR-based techniques process textual information in bug reports, while spectrum-based techniques process program spectra (i.e., a record of which program elements are executed for each test case). While both techniques ultimately generate a ranked list of program elements that likely contain a bug, they only consider one source of information--either bug reports or program spectra--which is not optimal. In light of this deficiency, this paper presents a new approach dubbed Network-clustered Multi-modal Bug Localization (NetML), which utilizes multi-modal information from both bug reports and program spectra to localize bugs. NetML facilitates effective bug localization by carrying out a joint optimization of bug localization error and clustering of both bug reports and program elements (i.e., methods). The clustering is achieved through the incorporation of network Lasso regularization, which incentivizes the model parameters of similar bug reports and similar program elements to be close together. To estimate the model parameters of both bug reports and methods, NetML employs an adaptive learning procedure based on Newton's method that updates the parameters on a per-feature basis. Extensive experiments on 355 real bugs from seven software systems have been conducted to benchmark NetML against various state-of-the-art localization methods. The results show that NetML surpasses the best-performing baseline by 31.82%, 22.35%, 19.72%, and 19.24%, in terms of the number of bugs successfully localized when a developer inspects the top 1, 5, and 10 methods and Mean Average Precision (MAP), respectively. Comment: IEEE Transactions on Software Engineering
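The evaluation measures named in this abstract (bugs localized within the top k inspected methods, and MAP) can be illustrated with a minimal sketch. It uses the standard information-retrieval definitions; whether the paper computes them in exactly this way is an assumption, and the method names in the example are made up.

```python
def hit_at_k(ranked, buggy, k):
    """True if at least one buggy method appears among the top-k ranked methods."""
    return any(method in buggy for method in ranked[:k])


def average_precision(ranked, buggy):
    """Mean of the precision values at each rank where a buggy method is found."""
    hits = 0
    precision_sum = 0.0
    for rank, method in enumerate(ranked, start=1):
        if method in buggy:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(buggy) if buggy else 0.0


def mean_average_precision(results):
    """results: list of (ranked_methods, buggy_methods) pairs, one per bug report."""
    if not results:
        return 0.0
    return sum(average_precision(r, b) for r, b in results) / len(results)


# Hypothetical ranking for one bug: the buggy methods sit at ranks 2 and 4.
ranked = ["parse", "render", "load", "save"]
buggy = {"render", "save"}
print(hit_at_k(ranked, buggy, 1))        # -> False
print(hit_at_k(ranked, buggy, 5))        # -> True
print(average_precision(ranked, buggy))  # -> 0.5
```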

    ASSESSING LANGUAGE QUALITY IN THE INFORMATION SYSTEMS DEVELOPMENT PROCESS – A THEORETICAL APPROACH AND ITS APPLICATION

    The necessary knowledge transfer and communication between project members is identified as a relevant issue in information systems development (ISD). Nevertheless, the impact of linguistic communication on ISD and requirements specification in its processual nature is still an open issue. In our research, we claim that the effectiveness of ISD depends on the ability to manage how people deal with language in practice and reach a shared understanding. We propose the concept of language quality as a suitable means for analyzing the emergence of concise and meaningful requirements in ISD. By applying the language quality dimensions developed from this concept to a real project, we were able to obtain practice-grounded propositions for practitioners to consider and for researchers to further evaluate the consequences of different actions on the interaction and communication processes for this particular field.

    Empirical Studies of Android API Usage: Suggesting Related API Calls and Detecting License Violations.

    We mine the API method calls used by Android App developers to (1) suggest related API calls based on the version history of Apps, (2) suggest related API calls based on StackOverflow posts, and (3) find potential App copyright and license violations based on the similarity of API calls made by them. Zimmermann et al. suggested that "Programmers who changed these functions also changed" functions that could be mined from previous groupings of functions found in the version history of a system. Our first contribution is to expand this approach to a community of Apps. Android developers use a set of API calls when creating Apps. These API methods are used in similar ways across multiple applications. Clustering co-changing API methods used by 230 Android Apps, we are able to predict the changes to API methods that individual App developers will make to their application with an average precision of 73% and recall of 25%. Our second contribution can be characterized as "Programmers who discussed these functions were also interested in these functions." Informal discussion on StackOverflow provides a rich source of related API methods as developers provide solutions to common problems. Clustering salient API methods in the same highly ranked posts, we are able to create rules that predict the changes App developers will make with an average precision of 64% and recall of 15%. Our last contribution is to find out whether proprietary Apps copy code from open source Apps, thereby violating the open source license. We have provided a set of techniques that determine how similar two Apps are based on the API calls they make. These techniques include Android API call matching, API call coverage, App categories, Method/Class clusters and released size of Apps. To validate this approach we conduct a case study of 150 open source projects and 950 proprietary projects.
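To make the idea of comparing Apps by the API calls they make more concrete, here is a minimal sketch of two set-based measures. The abstract does not say how matching or coverage is computed, so the Jaccard index and the coverage ratio below are assumptions chosen for illustration. The function names and both call sets are hypothetical.

```python
def jaccard_similarity(calls_a, calls_b):
    """Shared API calls divided by all distinct API calls used by either App."""
    union = calls_a | calls_b
    return len(calls_a & calls_b) / len(union) if union else 0.0


def coverage(reference_calls, candidate_calls):
    """Fraction of the reference App's API calls that also occur in the candidate App."""
    if not reference_calls:
        return 0.0
    return len(reference_calls & candidate_calls) / len(reference_calls)


# Hypothetical API-call sets extracted from two Apps.
open_source_app = {
    "android.app.Activity.onCreate",
    "android.widget.Toast.makeText",
    "android.location.LocationManager.getLastKnownLocation",
}
proprietary_app = {
    "android.widget.Toast.makeText",
    "android.location.LocationManager.getLastKnownLocation",
    "android.media.MediaPlayer.start",
}

print(jaccard_similarity(open_source_app, proprietary_app))  # -> 0.5
print(coverage(open_source_app, proprietary_app))            # 2 of 3 reference calls are covered
```

A high score on such measures would only flag a pair of Apps for closer inspection; on its own it does not establish copying.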

    Social aspects of collaboration in online software communities
