
    Source File Set Search for Clone-and-Own Reuse Analysis

    The clone-and-own approach is a natural way for software developers to reuse source code. To assess how known bugs and security vulnerabilities of a cloned component affect an application, developers and security analysts need to identify the original version of the component and understand how the cloned component differs from it. Although developers may record the original version information in a version control system and/or directory names, such information is often unavailable or incomplete. In this research, we propose a code search method that takes as input a set of source files and extracts all the components including similar files from a software ecosystem (i.e., a collection of existing versions of software packages). Our method employs an efficient file similarity computation using the b-bit minwise hashing technique. We use an aggregated file similarity for ranking components. To evaluate the effectiveness of this tool, we analyzed 75 cloned components in Firefox and Android source code. The tool took about two hours to report the original components from 10 million files in Debian GNU/Linux packages. Recall of the top-five components in the extracted lists is 0.907, while recall of a baseline using SHA-1 file hash is 0.773, according to the ground truth recorded in the source code repositories. Comment: 14th International Conference on Mining Software Repositories
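
The similarity search described in this abstract can be illustrated with a minimal sketch. It is not the authors' tool: the trigram tokenization, the BLAKE2b-based hash family, the simplified estimator (which assumes each file's token set is small relative to the token universe), and the "sum of best matches" aggregation are all assumptions made for illustration, as are the names `bbit_signature`, `estimate_jaccard`, and `rank_components`.

```python
import hashlib
import re


def tokens(text):
    """Split source text into a set of token trigrams (assumed tokenization)."""
    words = re.findall(r"\w+|[^\w\s]", text)
    return {" ".join(words[i:i + 3]) for i in range(max(1, len(words) - 2))}


def _hash(seed, token):
    """One member of a seeded 64-bit hash family (assumed; any family would do)."""
    digest = hashlib.blake2b(
        token.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def bbit_signature(text, k=256, b=1):
    """Keep only the lowest b bits of each of k min-hash values."""
    toks = tokens(text)
    mask = (1 << b) - 1
    return [min(_hash(seed, t) for t in toks) & mask for seed in range(k)]


def estimate_jaccard(sig_a, sig_b, b=1):
    """Simplified b-bit estimator: correct the match rate for chance collisions."""
    match = sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
    chance = 1.0 / (1 << b)
    return max(0.0, (match - chance) / (1.0 - chance))


def rank_components(query_files, components, k=256, b=1):
    """Rank components by the summed best similarity per query file (assumed aggregation).

    `components` maps a component name to a non-empty list of file contents.
    """
    q_sigs = [bbit_signature(t, k, b) for t in query_files]
    scores = {}
    for name, files in components.items():
        c_sigs = [bbit_signature(t, k, b) for t in files]
        scores[name] = sum(
            max(estimate_jaccard(q, c, b) for c in c_sigs) for q in q_sigs
        )
    return sorted(scores, key=scores.get, reverse=True)
```

Truncating each min-hash to b bits shrinks the signature, which is what makes comparison against millions of files affordable; the cost is a noisier estimate, so more hash functions (a larger `k`) are typically needed for the same accuracy.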

    Reducing Errors in Excel Models with Component-Based Software Engineering

    Model errors are pervasive and can be catastrophic. We can reduce model errors and time to market by applying Component-Based Software Engineering (CBSE) concepts to Excel models. CBSE assembles solutions from pre-built, pre-tested components rather than writing them from scratch with formulas. This is made possible by the introduction of LAMBDA, an Excel function that creates functions from Excel's formulas. CBSE-compliant LAMBDA functions can be reused in any project just like any Excel function. They also look exactly like Excel's native functions such as SUM(). This makes it possible for even junior modelers to leverage CBSE-compliant LAMBDAs to develop models more quickly and with fewer errors. Comment: 27 pages

    Using Peer Comparison Approaches to Measure Software Stability

    Software systems must change to adapt to new functional and nonfunctional requirements. This is called software revision. However, not all the modules within a system need to be changed during each revision. In this paper, we study how frequently each module is modified by comparing the stability of peer software modules. The study is performed on six open-source Java projects: Ant, Flow4j, Jena, Lucene, Struts, and Xalan, in which classes are identified as the basic software modules. Our study shows that (1) about half of all classes never changed; (2) frequent changes occur to a small number of classes; and (3) the number of classes changed between the current release and the next release has no significant relation to the time elapsed between the two releases. Keywords: software evolution; software revision; software stability; class stability; open-source project; Java class

    A Recommender Agent for Software Libraries: An Evaluation of Memory-Based and Model-Based Collaborative Filtering

    Software agents can conveniently facilitate knowledge discovery and knowledge sharing across an organisation. We contend that programming tasks are often mimicked, that knowledge concerning reusable libraries can be extracted automatically from source code repositories, and that this knowledge can then be filtered and presented to a developer in a manner that will encourage and support future software reuse. We introduce RASCAL, a recommender agent that continually recommends a set of task-relevant library methods to a developer. RASCAL learns how a particular reusable library is used and then employs this insight to make task-relevant recommendations to a developer. In this paper we detail our RASCAL agent and describe two recommendation techniques, namely Model-Based and Memory-Based Collaborative Filtering. We are interested in producing a scalable and efficient real-time recommender and thus ideally would favor a Model-Based approach. However, each scheme is evaluated against both runtime performance and recommendation accuracy. We present results and discuss the merits and limitations of each technique.
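
As a rough illustration of the memory-based variant mentioned in this abstract, the sketch below treats each existing class as a "user" and each library method as an "item", then recommends methods used by the most similar classes. It is a generic nearest-neighbour collaborative filter, not RASCAL's implementation: the usage-count representation, cosine similarity, the neighbourhood size `k`, and the function names are all assumptions.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two method-usage count vectors (dicts)."""
    dot = sum(a[m] * b[m] for m in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(active, corpus, k=2, top_n=3):
    """Suggest library methods the active class does not call yet.

    `active` maps method name -> call count for the class being written;
    `corpus` maps class name -> the same kind of mapping for existing classes.
    """
    # Memory-based step: compare against every stored class at query time.
    neighbours = sorted(
        ((cosine(active, usage), name) for name, usage in corpus.items()),
        reverse=True,
    )[:k]
    scores = Counter()
    for sim, name in neighbours:
        if sim <= 0:
            continue  # ignore classes with no overlap in library usage
        for method, count in corpus[name].items():
            if method not in active:
                scores[method] += sim * count
    return [method for method, _ in scores.most_common(top_n)]
```

Because every query scans the whole corpus, the cost of this scheme grows with the number of stored classes, which is the scalability concern that makes a precomputed model attractive for a real-time recommender.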