
    A Survey on Query-based API Recommendation

    Application Programming Interfaces (APIs) are designed to help developers build software more effectively. Recommending the right APIs for specific tasks has gained increasing attention among researchers and developers in recent years. To comprehensively understand this research domain, we have conducted a survey of API recommendation studies published in the last 10 years. Our study begins with an overview of the structure of API recommendation tools. Subsequently, we systematically analyze prior research and pose four key research questions. For RQ1, we examine the volume of published papers and the venues in which these papers appear within the API recommendation field. In RQ2, we categorize and summarize the prevalent data sources and collection methods employed in API recommendation research. In RQ3, we explore the types of data and common data representations utilized by API recommendation approaches, and we also investigate the typical data extraction and collection procedures used by existing approaches. RQ4 delves into the modeling techniques employed by API recommendation approaches, encompassing both statistical and deep learning models. Additionally, we compile an overview of the prevalent ranking strategies and evaluation metrics used for assessing API recommendation tools. Drawing from our survey findings, we identify current challenges in API recommendation research that warrant further exploration, along with potential avenues for future research.
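
    To make the evaluation side of such tools concrete, below is a minimal Python sketch of two metrics commonly used to assess ranked API recommendations, Mean Reciprocal Rank (MRR) and top-k accuracy; the function names and toy recommendation lists are our own illustration, not artifacts of the survey.

```python
# Minimal sketch of two common metrics for ranked API recommendations.
# The helper names and toy data below are illustrative, not from the survey.

def reciprocal_rank(recommended, relevant):
    """Return 1/rank of the first relevant API in the ranked list, else 0."""
    for rank, api in enumerate(recommended, start=1):
        if api in relevant:
            return 1.0 / rank
    return 0.0

def mrr(all_recommended, all_relevant):
    """Mean Reciprocal Rank over a set of queries."""
    scores = [reciprocal_rank(rec, rel)
              for rec, rel in zip(all_recommended, all_relevant)]
    return sum(scores) / len(scores)

def top_k_accuracy(all_recommended, all_relevant, k=5):
    """Fraction of queries whose top-k list contains a relevant API."""
    hits = sum(1 for rec, rel in zip(all_recommended, all_relevant)
               if any(api in rel for api in rec[:k]))
    return hits / len(all_recommended)

if __name__ == "__main__":
    recommended = [["List.add", "Map.put", "Set.add"],
                   ["String.split", "String.join"]]
    relevant = [{"Map.put"}, {"String.format"}]
    print(mrr(recommended, relevant))             # 0.25
    print(top_k_accuracy(recommended, relevant))  # 0.5
```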

    Extracting Tasks from Customize Portal using Natural Language Processing

    In software documentation, product knowledge and software requirements are very important for improving product quality. During the maintenance stage, developers cannot read the whole documentation of a large corpus; they need the relevant software documentation (i.e., development, design, testing, etc.) in a short period of time. Important documents can be recorded in software documentation, yet there is a gap between the information developers want and the documentation itself. To address this problem, we describe an approach for extracting relevant tasks that heuristically matches the structure of the documentation across three phases of software documentation (i.e., documentation, development, and testing). Our main idea is that tasks are extracted automatically from the software documentation, allowing the developer to easily obtain the required data through a customized portal using the WordNet library and machine learning techniques. The category of each task can then be generated from existing applications using natural language processing. Our approach uses the WordNet library to identify relevant tasks by calculating the frequency of each word, which allows developers to discover word usage in a piece of software.
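
    As a rough illustration of the word-frequency idea above, the following Python sketch lemmatizes documentation text with NLTK's WordNetLemmatizer, keeps only words that WordNet recognizes, and ranks them by frequency; the sample text and function name are invented, and the paper's actual heuristics and portal are not reproduced.

```python
# Illustrative sketch: rank documentation words by frequency with WordNet.
# Requires: pip install nltk, then nltk.download('wordnet') once.
import re
from collections import Counter

from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer

def frequent_task_words(doc_text, top_n=10):
    """Lemmatize words, keep those known to WordNet, and rank by frequency."""
    lemmatizer = WordNetLemmatizer()
    words = re.findall(r"[a-z]+", doc_text.lower())
    lemmas = [lemmatizer.lemmatize(w) for w in words]
    # Keep only words WordNet recognizes, as a crude relevance filter.
    known = [w for w in lemmas if wordnet.synsets(w)]
    return Counter(known).most_common(top_n)

if __name__ == "__main__":
    sample = ("Configure the build, run the unit tests, "
              "and deploy the build artifacts to the test server.")
    print(frequent_task_words(sample))
```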

    Context-Sensitive Code Completion

    Developers depend extensively on software frameworks and libraries to deliver products on time. While these frameworks and libraries support software reuse, save development time, and reduce the possibility of introducing errors, they do not come without a cost. Developers need to learn and remember Application Programming Interfaces (APIs) to use those frameworks and libraries effectively. However, APIs are difficult to learn and use. This is mostly because APIs are large in number, they may not be properly documented, and complex relationships exist between various classes and methods that make APIs difficult to learn. To support developers using those APIs, this thesis focuses on the code completion feature of modern integrated development environments (IDEs). As a developer types code, a code completion system offers a list of completion proposals through a popup menu to navigate and select. This research aims to improve the current state of code completion systems in discovering APIs. Towards this direction, a case study on tracking source code lines has been conducted to better understand how to capture code context and to evaluate the benefits of using the simhash technique. Observations from the study have helped to develop a simple, context-sensitive method call completion technique, called CSCC. The technique is compared with a large number of existing code completion techniques. The notion of context proposed in CSCC can even outperform graph-based statistical language models. Existing method call completion techniques leave the task of completing method parameters to developers. To address this issue, this thesis has investigated how developers complete method parameters. Based on the analysis, a method parameter completion technique, called PARC, has been developed. To date, the technique supports the largest number of expressions to complete method parameters. The technique has been implemented as an Eclipse plug-in that demonstrates the proof of concept. To meet application-specific requirements, software frameworks need to be customized via extension points. It was observed that developers often pass a framework-related object as an argument to an API call to customize default aspects of application frameworks. To enable such customizations, the object can be created by extending a framework class, implementing an interface, or changing the properties of the object via API calls. However, it is both a common and non-trivial task to find all the details related to these customizations. To address this issue, a technique has been developed, called FEMIR. The technique utilizes partial program analysis and graph mining techniques to detect, group, and rank framework extension examples. The tool extends existing code completion infrastructure to inform developers about customization choices, enabling them to browse through the extension points of a framework and the frequent usages of each point in terms of code examples. Findings from this research and the proposed techniques have the potential to help developers learn different aspects of APIs, thus easing software development and improving developer productivity.
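
    The simhash technique mentioned above can be sketched briefly: hash each token of the surrounding code context, combine the per-bit votes into one fingerprint, and treat a small Hamming distance between fingerprints as high context similarity. The Python sketch below is our simplified rendition of that idea, not the CSCC implementation.

```python
# Simplified simhash over code-context tokens, illustrating the kind of
# context similarity used for context-sensitive completion (not the actual
# CSCC implementation).
import hashlib

def simhash(tokens, bits=64):
    """Combine per-token hash bits into one fingerprint by majority vote."""
    votes = [0] * bits
    for tok in tokens:
        h = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16)
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, v in enumerate(votes):
        if v > 0:
            fingerprint |= 1 << i
    return fingerprint

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

if __name__ == "__main__":
    ctx1 = ["file", "reader", "new", "BufferedReader", "readLine"]
    ctx2 = ["reader", "new", "BufferedReader", "readLine", "close"]
    ctx3 = ["socket", "connect", "send", "bytes"]
    print(hamming(simhash(ctx1), simhash(ctx2)))  # smaller: overlapping contexts
    print(hamming(simhash(ctx1), simhash(ctx3)))  # larger: unrelated contexts
```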

    Stepwise API usage assistance based on N-gram language models

    Software development requires the use of external Application Programming Interfaces (APIs) in order to reuse libraries and frameworks. Programmers often struggle with unfamiliar APIs due to a lack of resources or less common designs. Such difficulties often lead to incorrect sequences of API calls that may not produce the desired outcome. Language models have shown the ability to capture regularities in text as well as in code. In this work we explore the use of n-gram language models and their ability to capture regularities in API usage through an intrinsic and extrinsic evaluation of these models on some of the most widely used APIs for the Java programming language. To achieve this, several language models were trained over source code corpora containing several hundred GitHub Java projects that use the desired APIs. In order to fully assess the performance of the language models, we selected APIs from multiple domains and with different vocabulary sizes. This work allowed us to conclude that n-gram language models are able to capture API usage patterns, given their low perplexity values and high overall coverage, reaching up to 100% in some cases, which encouraged us to create a code completion tool that helps programmers stay on the right path when using unknown APIs while still allowing for some exploration.
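
    To make the n-gram approach concrete, here is a toy Python sketch of a trigram model over API call sequences that suggests the most likely next call; the training sequences are invented, and the smoothing, tokenization, and evaluation choices of the actual study are not reproduced.

```python
# Toy trigram model over API call sequences, sketching how n-gram language
# models can suggest the next API call (training data is invented).
from collections import Counter, defaultdict

class TrigramApiModel:
    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, sequences):
        """Count how often each call follows each pair of preceding calls."""
        for seq in sequences:
            padded = ["<s>", "<s>"] + seq
            for a, b, c in zip(padded, padded[1:], padded[2:]):
                self.counts[(a, b)][c] += 1

    def suggest(self, prev2, prev1, k=3):
        """Return the k most frequent next calls after the given two calls."""
        return [api for api, _ in self.counts[(prev2, prev1)].most_common(k)]

if __name__ == "__main__":
    sequences = [
        ["File.open", "BufferedReader.new", "BufferedReader.readLine",
         "BufferedReader.close"],
        ["File.open", "BufferedReader.new", "BufferedReader.readLine",
         "BufferedReader.readLine", "BufferedReader.close"],
    ]
    model = TrigramApiModel()
    model.train(sequences)
    print(model.suggest("File.open", "BufferedReader.new"))
    # ['BufferedReader.readLine']
```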

    Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit

    Code intelligence leverages machine learning techniques to extract knowledge from extensive code corpora, with the aim of developing intelligent tools to improve the quality and productivity of computer programming. There is already a thriving research community focusing on code intelligence, with efforts spanning software engineering, machine learning, data mining, natural language processing, and programming languages. In this paper, we conduct a comprehensive literature review on deep learning for code intelligence, from the aspects of code representation learning, deep learning techniques, and application tasks. We also benchmark several state-of-the-art neural models for code intelligence, and provide an open-source toolkit tailored for the rapid prototyping of deep-learning-based code intelligence models. In particular, we inspect existing code intelligence models from the perspective of code representation learning, and provide a comprehensive overview to enhance comprehension of the present state of code intelligence. Furthermore, we publicly release the source code and data resources to provide the community with a ready-to-use benchmark, which can facilitate the evaluation and comparison of existing and future code intelligence models (https://xcodemind.github.io). Finally, we also point out several challenging and promising directions for future research.
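
    As a minimal illustration of what a code representation can be, the sketch below encodes source tokens as integer IDs over a learned vocabulary, the simplest token-level form among the code representations such surveys discuss; the tokenizer and names are ours and far simpler than the neural representations benchmarked in the paper.

```python
# Minimal token-level code representation: map source tokens to integer IDs.
# The rough tokenizer and vocabulary handling here are illustrative only.
import re

def tokenize(code):
    """Very rough tokenizer: identifiers, numbers, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)

def build_vocab(snippets):
    """Assign an integer ID to each distinct token, reserving 0 for unknowns."""
    vocab = {"<unk>": 0}
    for snippet in snippets:
        for tok in tokenize(snippet):
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(code, vocab):
    """Turn a code snippet into the ID sequence a neural model would consume."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(code)]

if __name__ == "__main__":
    corpus = ["int add(int a, int b) { return a + b; }"]
    vocab = build_vocab(corpus)
    print(encode("int sub(int a, int b) { return a - b; }", vocab))
```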