33 research outputs found

Ontological Engineering For Source Code Generation

Author: Aloklah Anas Hamid
Aref Mostafa Mohamed, Prof.
Gad Walaa
Salem Abd El-Badea Mohamed, Prof.
Publication venue: Arab Journals Platform
Publication date: 29/09/2020

Source Code Generation (SCG) is the sub-domain of Automatic Programming (AP) that helps programmers program using high-level abstractions. Recently, researchers have investigated many techniques for SCG. The problem is choosing the appropriate technique to generate source code given its purpose and inputs. This paper presents a review and analysis of related SCG techniques. Moreover, comparisons are presented for: techniques mapping, Natural Language Processing (NLP), knowledge base, ontology, Specification Configuration Template (SCT) model and deep learnin

Semantic Source Code Models Using Identifier Embeddings

Author: Efstathiou Vasiliki
Spinellis Diomidis
Publication date: 15/04/2019

The emergence of online open source repositories in recent years has led to an explosion in the volume of openly available source code, coupled with metadata that relate to a variety of software development activities. As a result, in line with recent advances in machine learning research, software maintenance activities are switching from symbolic formal methods to data-driven methods. In this context, the rich semantics hidden in source code identifiers provide opportunities for building semantic representations of code which can assist tasks of code search and reuse. To this end, we deliver distributed code representations, in the form of pretrained vector space models, for six popular programming languages, namely, Java, Python, PHP, C, C++, and C#. The models are produced using fastText, a state-of-the-art library for learning word representations. Each model is trained on data from a single programming language; the code mined for producing all models amounts to over 13,000 repositories. We indicate dissimilarities between natural language and source code, as well as variations in coding conventions between the different programming languages we processed. We describe how these heterogeneities guided the data preprocessing decisions we took and the selection of the training parameters in the released models. Finally, we propose potential applications of the models and discuss limitations of the models.

Comment: 16th International Conference on Mining Software Repositories (MSR 2019): Data Showcase Trac
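
As a rough illustration of the kind of preprocessing that identifier-based models may involve, the sketch below splits source code identifiers into lowercase subtokens. This is an assumption made for illustration only: the abstract does not say how the authors tokenized identifiers, and the function name `split_identifier` is hypothetical.

```python
import re


def split_identifier(name: str) -> list[str]:
    """Split a snake_case or camelCase identifier into lowercase subtokens.

    Hypothetical helper; not taken from the paper or from fastText.
    """
    parts: list[str] = []
    # First break on underscores and any other non-word characters.
    for chunk in re.split(r"[_\W]+", name):
        # Then break on case boundaries, keeping acronyms such as "HTTP" whole
        # and treating digit runs as their own subtokens.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    return [p.lower() for p in parts if p]


if __name__ == "__main__":
    for ident in ("parseHTTPResponse", "max_line_count2"):
        print(ident, split_identifier(ident))
```

A stream of such subtokens, one source file per line, is one plausible input format for a word-embedding trainer; whether the released models were trained on whole identifiers or on subtokens is not stated in the abstract.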

arXiv.org e-Print Archive

Crossref

Vary: An IDE for Designing Algorithms and Measuring Quality

Author: Dodero Juan Manuel
Hurtado Nuria
Mota José Miguel
Person Tatiana
Ruiz-Rube Iván
Silva-Ramírez Esther-Lydia
Publication venue: AIS Electronic Library (AISeL)
Publication date: 02/10/2018

Pseudocode is one of the recommended methods for teaching students to design algorithms. Having a tool that automatically translates an algorithm written in pseudocode into a programming language would allow the student to understand the complete process of program development. In addition, introducing quality measurement of the designed algorithms from the first steps of learning programming would enable the student to understand the importance of code quality for software maintenance processes. This work describes Vary, an integrated development environment based on Eclipse for writing and running pseudocode algorithms. The environment automatically transforms abstract pseudocode into runnable C/C++ source code that can be later executed. Computer programming learners and even computational scientists can use Vary to write and run algorithms, while taking advantage of modern development environment features. Vary is provided with an additional extension to automatically carry out algorithm analysis with SonarQube

A Neural Model for Generating Natural Language Summaries of Program Subroutines

Author: Jiang Siyuan
LeClair Alexander
McMillan Collin
Publication date: 05/02/2019

Source code summarization -- creating natural language descriptions of source code behavior -- is a rapidly growing research topic with applications to automatic documentation generation, program comprehension, and software maintenance. Traditional techniques relied on heuristics and templates built manually by human experts. Recently, data-driven approaches based on neural machine translation have largely overtaken template-based systems. But nearly all of these techniques rely almost entirely on programs having good internal documentation; without clear identifier names, the models fail to create good summaries. In this paper, we present a neural model that combines words from code with code structure from an AST. Unlike previous approaches, our model processes each data source as a separate input, which allows the model to learn code structure independent of the text in code. This process helps our approach provide coherent summaries in many cases even when zero internal documentation is provided. We evaluate our technique with a dataset we created from 2.1m Java methods. We find improvement over two baseline techniques from SE literature and one from NLP literature
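
The idea of treating words and structure as separate inputs can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard `ast` module as a stand-in (the paper works on Java methods), the function name `two_views` is hypothetical, and the neural model itself is not shown.

```python
import ast


def two_views(source: str) -> tuple[list[str], list[str]]:
    """Return (words, structure) for a piece of Python source code.

    words: identifier names found in the code (the textual view).
    structure: AST node type names with all identifiers dropped
    (the structural view). Hypothetical helper for illustration.
    """
    tree = ast.parse(source)
    words: list[str] = []
    structure: list[str] = []
    for node in ast.walk(tree):
        # The structural view records only the kind of each node.
        structure.append(type(node).__name__)
        # The textual view collects names from the nodes that carry them.
        if isinstance(node, ast.FunctionDef):
            words.append(node.name)
        elif isinstance(node, ast.Name):
            words.append(node.id)
        elif isinstance(node, ast.arg):
            words.append(node.arg)
    return words, structure


if __name__ == "__main__":
    words, structure = two_views("def add(a, b):\n    return a + b\n")
    print(words)
    print(structure)
```

Because the structural view carries no identifier text, two functions that differ only in naming yield the same structure sequence, which is the property that lets a model use structure when names are uninformative.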

arXiv.org e-Print Archive

Crossref

Eastern Michigan University: Digital Commons@EMU