Towards Automatic Generation of Short Summaries of Commits

Authors: Jiang Siyuan, McMillan Collin
Publication date: 28/03/2017

Committing to a version control system means submitting a software change to the system. Each commit can have a message to describe the submission. Several approaches have been proposed to automatically generate the content of such messages. However, the quality of the automatically generated messages falls far short of what humans write. In studying the differences between auto-generated and human-written messages, we found that 82% of the human-written messages have only one sentence, while the automatically generated messages often have multiple lines. Furthermore, we found that the commit messages often begin with a verb followed by a direct object. This finding inspired us to use a "verb+object" format in this paper to generate short commit summaries. We split the approach into two parts: verb generation and object generation. As our first try, we trained a classifier to assign a verb to a diff. We are seeking feedback from the community before we continue to work on generating direct objects for the commits.

Comment: 4 pages, accepted in ICPC 2017 ERA Track
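The verb-generation step described in this abstract can be pictured as ordinary text classification over diffs. The sketch below is not the authors' classifier: it is a toy multinomial Naive Bayes model, the training diffs and the verb labels (`add`, `remove`, `fix`) are hypothetical, and tagging each token with its `+`/`-` sign is an assumption made here for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(diff):
    """Return word tokens from changed lines, each tagged with its +/- sign."""
    tokens = []
    for line in diff.splitlines():
        if line.startswith(("+", "-")):
            sign = line[0]
            tokens.extend(sign + w for w in re.findall(r"[a-z_]+", line[1:].lower()))
    return tokens


class VerbClassifier:
    """Toy multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, diffs, verbs):
        self.word_counts = defaultdict(Counter)  # verb -> token frequencies
        self.verb_counts = Counter(verbs)        # verb -> number of training diffs
        for diff, verb in zip(diffs, verbs):
            self.word_counts[verb].update(tokenize(diff))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, diff):
        tokens = tokenize(diff)
        total = sum(self.verb_counts.values())

        def score(verb):
            counts = self.word_counts[verb]
            denom = sum(counts.values()) + len(self.vocab)
            log_prob = math.log(self.verb_counts[verb] / total)
            for tok in tokens:
                log_prob += math.log((counts[tok] + 1) / denom)
            return log_prob

        return max(self.verb_counts, key=score)


# Hypothetical training pairs: (diff text, leading verb of the commit message).
TRAIN = [
    ("+def parse_config(path):\n+    return load(path)", "add"),
    ("+class Cache:\n+    pass", "add"),
    ("-def old_helper():\n-    pass", "remove"),
    ("-import legacy_module", "remove"),
    ("-    if x = None:\n+    if x is None:", "fix"),
    ("-    return total / count\n+    return total / max(count, 1)", "fix"),
]

clf = VerbClassifier().fit(*zip(*TRAIN))
predicted_verb = clf.predict("+def new_feature():\n+    pass")
```

A real system would need far more training data and richer features; the point here is only the shape of the task, a diff in and a single verb out.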

arXiv.org e-Print Archive

Crossref

A Neural Model for Generating Natural Language Summaries of Program Subroutines

Authors: Jiang Siyuan, LeClair Alexander, McMillan Collin
Publication date: 05/02/2019

Source code summarization -- creating natural language descriptions of source code behavior -- is a rapidly growing research topic with applications to automatic documentation generation, program comprehension, and software maintenance. Traditional techniques relied on heuristics and templates built manually by human experts. Recently, data-driven approaches based on neural machine translation have largely overtaken template-based systems. But nearly all of these techniques rely almost entirely on programs having good internal documentation; without clear identifier names, the models fail to create good summaries. In this paper, we present a neural model that combines words from code with code structure from an AST. Unlike previous approaches, our model processes each data source as a separate input, which allows the model to learn code structure independent of the text in code. This process helps our approach provide coherent summaries in many cases even when zero internal documentation is provided. We evaluate our technique with a dataset we created from 2.1 million Java methods. We find improvement over two baseline techniques from SE literature and one from NLP literature.
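The idea of feeding code words and AST structure as separate inputs can be illustrated without any neural network. The sketch below is an assumption-laden stand-in: it uses Python's standard `ast` module on Python functions (the paper works with Java methods), and the helper names `code_words` and `ast_structure` are hypothetical. It shows only that the structure input survives when identifier names carry no information.

```python
import ast
import re


def code_words(source):
    """Text input: identifier names split into lowercase subwords."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            names.append(node.name)
        elif isinstance(node, ast.arg):
            names.append(node.arg)
        elif isinstance(node, ast.Name):
            names.append(node.id)
    words = []
    for name in names:
        words.extend(w.lower() for w in re.findall(r"[A-Za-z][a-z]*", name))
    return words


def ast_structure(source):
    """Structure input: AST node type names only, with no identifier text."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


# The same function with meaningful names and with obfuscated names.
CLEAR = (
    "def total_price(items, tax_rate):\n"
    "    subtotal = sum(items)\n"
    "    return subtotal * (1 + tax_rate)\n"
)
OBFUSCATED = (
    "def f(a, b):\n"
    "    c = g(a)\n"
    "    return c * (1 + b)\n"
)

same_structure = ast_structure(CLEAR) == ast_structure(OBFUSCATED)
same_words = code_words(CLEAR) == code_words(OBFUSCATED)
```

Here `same_structure` is true while `same_words` is false: the two versions differ only in the text channel, which is the situation in which a separate structure input is meant to help.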

arXiv.org e-Print Archive

Crossref

Eastern Michigan University: Digital Commons@EMU

The current situation and management of idle rural homesteads in China - based on a survey in Jiangxi province

Authors: Gan Zhongyang, Jiang Siyuan
Publication venue: RUS
Publication date: 01/01/2020

China is generally still in the middle, accelerating stage of urbanization. Idle rural homesteads are a main problem in rural China. According to two elements (population and land), they can be divided into two types: the first is one household with houses, and the second is population migration. Through questionnaire research and interview analysis, the authors find that the traditional concept of land is still deeply rooted among farmers. Building houses only along the roads and ostentatious multi-story construction are prominent phenomena. The needs of traditional agricultural production have become a major obstacle to the management system for idle homesteads. Idle homesteads are an inevitable result of social and economic development, but the problem is becoming more and more serious because current law lags behind and inadequate management systems are not properly regulated. The authors suggest that the management system for idle homesteads be built in three steps based on villager autonomy: first, promote a voluntary withdrawal system for idle homesteads; second, issue homestead use-right certificates; third, classify the ways in which idle homesteads are used.

SSOAR - Social Science Open Access Repository

The Path and Enlightenment of Data-Driven Digital Transformation of Organizational Learning: A Case Study of the Practice of China Telecom

Authors: Jiang Guanghua, Xu Siyuan, Yu Wenhao
Publication venue: New Prairie Press
Publication date: 01/06/2023

This paper takes China Telecom as a case study. It analyzes data-driven digital transformation in organizational learning and summarizes the methods and lessons of that transformation.

Kansas State University

Fabrication and characterizations of proton-exchanged LiNbO3 waveguides fabricated by inductively coupled plasma technique

Authors: Hallam K. R., Heard P. J., Jiang Q., Ren Z., Varrazza R., Wotherspoon Alex, Yu Siyuan
Publication venue: AIP Publishing
Publication date: 01/04/2006

This Letter reports the use of an inductively coupled plasma technique for fabrication of proton-exchanged (PE) LiNbO3 (LN) waveguides. Planar and stripe waveguides have been formed in Y-cut LN, which are difficult to obtain with the conventional molten acid method due to the occurrence of surface damage. Secondary ion mass spectrometry, scanning electron microscopy, and infrared absorption spectrum characterization results revealed that a uniform vertical PE profile with a single low-order crystal phase has been directly obtained as a result of this unique process. X-ray photoelectron spectroscopy characterization of the treated surface revealed the existence of NbO as the cause of a sometimes darkened surface and confirmed the ability to completely restore the surface to LN by oxygen plasma treatment. Atomic force microscopy measurement confirms that good surface quality has been maintained after regeneration of the surface to LN.

Crossref

Warwick Research Archives Portal Repository

Explore Bristol Research

Statement-based Memory for Neural Source Code Summarization

Authors: Bansal Aakash, Haque Sakib, Jiang Siyuan, McMillan Collin
Publication date: 21/07/2023

Source code summarization is the task of writing natural language descriptions of source code behavior. Code summarization underpins software documentation for programmers. Short descriptions of code help programmers understand the program quickly without having to read the code itself. Lately, neural source code summarization has emerged as the frontier of research into automated code summarization techniques. By far the most popular targets for summarization are program subroutines. The idea, in a nutshell, is to train an encoder-decoder neural architecture using large sets of examples of subroutines extracted from code repositories. The encoder represents the code and the decoder represents the summary. However, most current approaches attempt to treat the subroutine as a single unit, for example by taking the entire subroutine as input to a Transformer or RNN-based encoder. But code behavior tends to depend on the flow from statement to statement. Normally, dynamic analysis may shed light on this flow, but dynamic analysis on hundreds of thousands of examples in large datasets is not practical. In this paper, we present a statement-based memory encoder that learns the important elements of flow during training, leading to a statement-based subroutine representation without the need for dynamic analysis. We implement our encoder for code summarization and demonstrate a significant improvement over the state-of-the-art.

Comment: 10 pages, 2 figures
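The contrast between treating a subroutine as a single unit and treating it statement by statement can be shown at the input level. The sketch below covers only that segmentation step, not the memory encoder itself. It assumes Python 3.9 or later (for `ast.unparse`), works on Python functions as a stand-in for the paper's subroutines, and the function name `statement_tokens` is hypothetical.

```python
import ast
import io
import tokenize

# Token types kept in each sequence; layout tokens such as NEWLINE are dropped.
KEPT_TYPES = (tokenize.NAME, tokenize.OP, tokenize.NUMBER, tokenize.STRING)


def statement_tokens(source):
    """Return one token list per top-level statement of the first function."""
    func = ast.parse(source).body[0]
    sequences = []
    for stmt in func.body:
        text = ast.unparse(stmt)  # normalized source text of this statement
        reader = io.StringIO(text).readline
        sequences.append(
            [tok.string for tok in tokenize.generate_tokens(reader)
             if tok.type in KEPT_TYPES]
        )
    return sequences


SOURCE = '''
def mean(values):
    total = sum(values)
    count = len(values)
    return total / count
'''

per_statement = statement_tokens(SOURCE)
# A single-unit encoder would instead see one flat sequence.
flat = [tok for seq in per_statement for tok in seq]
```

Each inner list could be encoded separately, so that a model has a natural place to learn how one statement feeds the next; how the paper's encoder actually uses such units is not specified in this abstract.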

arXiv.org e-Print Archive