445 research outputs found

    Towards Automatic Generation of Short Summaries of Commits

    Committing to a version control system means submitting a software change to the system. Each commit can have a message to describe the submission. Several approaches have been proposed to automatically generate the content of such messages. However, the quality of the automatically generated messages falls far short of what humans write. In studying the differences between auto-generated and human-written messages, we found that 82% of the human-written messages have only one sentence, while the automatically generated messages often have multiple lines. Furthermore, we found that the commit messages often begin with a verb followed by a direct object. This finding inspired us to use a "verb+object" format in this paper to generate short commit summaries. We split the approach into two parts: verb generation and object generation. As a first attempt, we trained a classifier to map a diff to a verb. We are seeking feedback from the community before we continue to work on generating direct objects for the commits.
    Comment: 4 pages, accepted in ICPC 2017 ERA Trac

    A Neural Model for Generating Natural Language Summaries of Program Subroutines

    Source code summarization -- creating natural language descriptions of source code behavior -- is a rapidly growing research topic with applications to automatic documentation generation, program comprehension, and software maintenance. Traditional techniques relied on heuristics and templates built manually by human experts. Recently, data-driven approaches based on neural machine translation have largely overtaken template-based systems. But nearly all of these techniques rely almost entirely on programs having good internal documentation; without clear identifier names, the models fail to create good summaries. In this paper, we present a neural model that combines words from code with code structure from an AST. Unlike previous approaches, our model processes each data source as a separate input, which allows the model to learn code structure independent of the text in code. This process helps our approach provide coherent summaries in many cases even when zero internal documentation is provided. We evaluate our technique with a dataset we created from 2.1 million Java methods. We find improvement over two baseline techniques from SE literature and one from NLP literature.

    The current situation and management of idle rural homesteads in China - based on a survey in Jiangxi province

    In general, China is still in the middle, accelerating stage of urbanization. Idle rural homesteads are a major problem in rural areas of China. According to two elements (population and land), they can be divided into two types: the first is one household with houses, and the second is population migration. Through questionnaire research and interview analysis, the authors find that the traditional concept of land is still deeply rooted among farmers. The phenomena of building houses only along roads and of ostentatious multi-story construction are prominent. The needs of traditional agricultural production have become a major obstacle to the management system for idle homesteads. Idle homesteads are at root an inevitable result of social and economic development, but the problem is becoming more and more serious because current law lags behind and management systems are inadequate and poorly regulated. The authors suggest that the management system for idle homesteads should proceed in three steps based on villager autonomy: the first step is to promote a voluntary withdrawal system for idle homesteads, the second step is to issue homestead use-right certificates, and the third step is to classify the ways in which idle homesteads are used.

    The Path and Enlightenment of Data-Driven Digital Transformation of Organizational Learning ——A Case Study of the Practice of China Telecom

    This paper takes China Telecom as a case study. It analyzes data-driven digital transformation in organizational learning and summarizes the methods and lessons of that transformation.

    Fabrication and characterizations of proton-exchanged LiNbO3 waveguides fabricated by inductively coupled plasma technique

    This Letter reports the use of an inductively coupled plasma technique for fabrication of proton-exchanged (PE) LiNbO3 (LN) waveguides. Planar and stripe waveguides have been formed in Y-cut LN, which are difficult to obtain with the conventional molten acid method due to the occurrence of surface damage. Secondary ion mass spectrometry, scanning electron microscopy, and infrared absorption spectrum characterization results revealed that a uniform vertical PE profile with a single low order crystal phase has been directly obtained as a result of this unique process. X-ray photoelectron spectroscopy characterization of the treated surface revealed the existence of NbO as the cause of a sometimes darkened surface and confirmed the ability to completely restore the surface to LN by oxygen plasma treatment. Atomic force microscopy measurement confirmed that good surface quality was maintained after regeneration of the surface to LN.

    Statement-based Memory for Neural Source Code Summarization

    Source code summarization is the task of writing natural language descriptions of source code behavior. Code summarization underpins software documentation for programmers. Short descriptions of code help programmers understand the program quickly without having to read the code itself. Lately, neural source code summarization has emerged as the frontier of research into automated code summarization techniques. By far the most popular targets for summarization are program subroutines. The idea, in a nutshell, is to train an encoder-decoder neural architecture using large sets of examples of subroutines extracted from code repositories. The encoder represents the code and the decoder represents the summary. However, most current approaches attempt to treat the subroutine as a single unit, for example by taking the entire subroutine as input to a Transformer or RNN-based encoder. But code behavior tends to depend on the flow from statement to statement. Normally dynamic analysis may shed light on this flow, but dynamic analysis on hundreds of thousands of examples in large datasets is not practical. In this paper, we present a statement-based memory encoder that learns the important elements of flow during training, leading to a statement-based subroutine representation without the need for dynamic analysis. We implement our encoder for code summarization and demonstrate a significant improvement over the state-of-the-art.
    Comment: 10 pages 2 figure