9 research outputs found

    Design implications for task-specific search utilities for retrieval and re-engineering of code

    The importance of information retrieval systems is unquestionable in modern society, and both individuals and enterprises recognise the benefits of being able to find information effectively. Current code-focused information retrieval systems such as Google Code Search, Codeplex or Koders produce results based on specific keywords. However, these systems do not take into account developers’ context such as development language, technology framework, goal of the project, project complexity and developer’s domain expertise. They also impose additional cognitive burden on users in switching between different interfaces and clicking through to find the relevant code. Hence, they are not used by software developers. In this paper, we discuss how software engineers interact with information and general-purpose information retrieval systems (e.g. Google, Yahoo!) and investigate to what extent domain-specific search and recommendation utilities can be developed in order to support their work-related activities. In order to investigate this, we conducted a user study and found that software engineers followed many identifiable and repeatable work tasks and behaviours. These behaviours can be used to develop implicit relevance feedback-based systems based on the observed retention actions. Moreover, we discuss the implications for the development of task-specific search and collaborative recommendation utilities embedded with the Google standard search engine and Microsoft IntelliSense for retrieval and re-engineering of code. Based on implicit relevance feedback, we have implemented a prototype of the proposed collaborative recommendation system, which was evaluated in a controlled environment simulating the real-world situation of professional software engineers. The evaluation has achieved promising initial results on the precision and recall performance of the system.

    From Query to Usable Code: An Analysis of Stack Overflow Code Snippets

    Enriched by natural language texts, Stack Overflow code snippets are an invaluable code-centric knowledge base of small units of source code. Besides being useful for software developers, these annotated snippets can potentially serve as the basis for automated tools that provide working code solutions to specific natural language queries. With the goal of developing automated tools with the Stack Overflow snippets and surrounding text, this paper investigates the following questions: (1) How usable are the Stack Overflow code snippets? and (2) When using text search engines for matching on the natural language questions and answers around the snippets, what percentage of the top results contain usable code snippets? A total of 3M code snippets are analyzed across four languages: C#, Java, JavaScript, and Python. Python and JavaScript proved to be the languages for which the most code snippets are usable. Conversely, Java and C# proved to be the languages with the lowest usability rate. Further qualitative analysis on usable Python snippets shows the characteristics of the answers that solve the original question. Finally, we use Google search to investigate the alignment of usability and the natural language annotations around code snippets, and explore how to make snippets in Stack Overflow an adequate base for future automatic program generation. Comment: 13th IEEE/ACM International Conference on Mining Software Repositories, 11 pages

    Squirrel: A Code Snippet Repository

    Teaching programming courses in an academic environment has challenges, particularly for undergraduate students. These challenges can also be found in the software industry, where novice developers still need to obtain coding training. In practice, lecturers or trainers always assign exercises to students with the goal of improving their coding skills. Based on teaching experience, we have found that students may need to rely on software development kits to code an application faster, such as reusing lines of code already developed (code snippets) to solve a particular problem. Additionally, to reduce the software development time, novice developers tend to use crowdsourcing and social media sources of example code that can be used to solve programming problems. Utilizing these code snippets not only saves time for students or novice developers but also helps them to learn. Therefore, we proposed a system, called Squirrel, that is capable of offline and online code snippet storage and searching. Additionally, for online functionalities, this application supports searching for code snippets that exist in question-and-answer websites, such as Stack Overflow. We believe that this system will help increase the effectiveness of software development, especially for computer-science students and novice developers

    Software search is not a science, even among scientists: A survey of how scientists and engineers find software

    Improved software discovery is a prerequisite for greater software reuse: after all, if someone cannot find software for a particular task, they cannot reuse it. Understanding people’s approaches and preferences when they look for software could help improve facilities for software discovery. We surveyed people working in several scientific and engineering fields to better understand their approaches and selection criteria. We found that even among highly-trained people, the rudimentary approaches of relying on general Web searches, the opinions of colleagues, and the literature were still the most commonly used. However, those who were involved in software development differed from nondevelopers in their use of social help sites, software project repositories, software catalogs, and organization-specific mailing lists or forums. For example, software developers in our sample were more likely to search in community sites such as Stack Overflow even when seeking ready-to-run software rather than source code, and likewise, asking colleagues was significantly more important when looking for ready-to-run software. Our survey also provides insight into the criteria that matter most to people when they are searching for ready-to-run software. Finally, our survey also identifies some factors that can prevent people from finding software

    Supporting an Online community of amateur creators

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 169-176). This work describes a framework for the design and study of an online community of amateur creators. I focus on remixing as the lens to understand the contexts and processes of creative expression as it is fostered within social media environments. I am motivated by three broad questions: 1) Process: how do people remix and what is the role of remixing in cultural production and social learning? 2) Conditions: what kind of attributes influence people's remixing practices? 3) Attitudes: what are people's attitudes toward remixing? As part of this work, I conceived, developed and studied the Scratch Online Community: a website where young people share and remix their own video games and animations, as well as those of their peers. In five years, the community has grown to more than one million registered members and two million community-contributed projects. In the spirit of the theme of this work, this dissertation remixes several articles and blog posts written by myself or in collaboration with others. Wherever possible, the sources of the material are noted. by Andrés Monroy-Hernández. Ph.D.

    Empirical Studies of Android API Usage: Suggesting Related API Calls and Detecting License Violations.

    We mine the API method calls used by Android App developers to (1) suggest related API calls based on the version history of Apps, (2) suggest related API calls based on StackOverflow posts, and (3) find potential App copyright and license violations based on the similarity of API calls made by them. Zimmermann et al. suggested that “Programmers who changed these functions also changed” functions that could be mined from previous groupings of functions found in the version history of a system. Our first contribution is to expand this approach to a community of Apps. Android developers use a set of API calls when creating Apps. These API methods are used in similar ways across multiple applications. Clustering co-changing API methods used by 230 Android Apps, we are able to predict the changes to API methods that individual App developers will make to their application with an average precision of 73% and recall of 25%. Our second contribution can be characterized as “Programmers who discussed these functions were also interested in these functions.” Informal discussion on StackOverflow provides a rich source of related API methods as developers provide solutions to common problems. Clustering salient API methods in the same highly ranked posts, we are able to create rules that predict the changes App developers will make with an average precision of 64% and recall of 15%. Our last contribution is to find out whether proprietary Apps copy code from open source Apps, thereby violating the open source license. We have provided a set of techniques that determines how similar two Apps are based on the API calls they make. These techniques include Android API calls matching, API calls coverage, App categories, Method/Class clusters and released size of Apps. To validate this approach we conduct a case study of 150 open source projects and 950 proprietary projects.