52 research outputs found
srcMX: A GUI Application for srcML
srcMX is a GUI application that utilizes the srcML command-line tool to convert and display source code using the srcML format. The goal is for srcMX to promote the manipulation and exploration of source code using srcML. I also hope that the user-friendly nature inherent to GUI applications allows srcMX to introduce a larger audience to the many features offered by srcML. The application is written in C++ using the Qt and Qt Quick frameworks
Developing & Marketing a JavaScript Support Extension for the srcML Infrastructure
This work describes the development of a JavaScript grammar file for the srcML infrastructure\u27s future parser generator, the many files used to test it, and the marketing plan used to market the JavaScript Support extension to industry software developers and potential collaborators
A Study on Developer Perception of Transformation Languages for Refactoring
Although there is much research advancing state-of-art of program transformation tools, their application in industry source code change problems has not yet been gauged. In this context, the purpose of this paper is to better understand developer familiarity and comfort with these languages by conducting a survey. It poses, and answers, four research questions to understand how frequently source code transformation languages are applied to refactoring tasks, how well-known these languages are in industry, what developers think are obstacles to adoption, and what developer refactoring habits tell us about their current use, or underuse, of transformation languages. The results show that while source code transformation languages can fill a needed niche in refactoring, research must motivate their application. We provide explanations and insights based on data, aimed at the program transformation and refactoring communities, with a goal to motivate future research and ultimately improve industry adoption of transformation languages for refactoring tasks
Abstract Syntax Tree for Programming Language Understanding and Representation: How Far Are We?
Programming language understanding and representation (a.k.a code
representation learning) has always been a hot and challenging task in software
engineering. It aims to apply deep learning techniques to produce numerical
representations of the source code features while preserving its semantics.
These representations can be used for facilitating subsequent code-related
tasks. The abstract syntax tree (AST), a fundamental code feature, illustrates
the syntactic information of the source code and has been widely used in code
representation learning. However, there is still a lack of systematic and
quantitative evaluation of how well AST-based code representation facilitates
subsequent code-related tasks. In this paper, we first conduct a comprehensive
empirical study to explore the effectiveness of the AST-based code
representation in facilitating follow-up code-related tasks. To do so, we
compare the performance of models trained with code token sequence (Token for
short) based code representation and AST-based code representation on three
popular types of code-related tasks. Surprisingly, the overall quantitative
statistical results demonstrate that models trained with AST-based code
representation consistently perform worse across all three tasks compared to
models trained with Token-based code representation. Our further quantitative
analysis reveals that models trained with AST-based code representation
outperform models trained with Token-based code representation in certain
subsets of samples across all three tasks. We also conduct comprehensive
experiments to evaluate and reveal the impact of the choice of AST
parsing/preprocessing/encoding methods on AST-based code representation and
subsequent code-related tasks. Our study provides future researchers with
detailed guidance on how to select solutions at each stage to fully exploit
AST.Comment: submitted to ACM Transactions on Software Engineering and
Methodology. arXiv admin note: text overlap with arXiv:2103.10668 by other
author
MAGPIE: Machine Automated General Performance Improvement via Evolution of Software
Performance is one of the most important qualities of software. Several
techniques have thus been proposed to improve it, such as program
transformations, optimisation of software parameters, or compiler flags. Many
automated software improvement approaches use similar search strategies to
explore the space of possible improvements, yet available tooling only focuses
on one approach at a time. This makes comparisons and exploration of
interactions of the various types of improvement impractical.
We propose MAGPIE, a unified software improvement framework. It provides a
common edit sequence based representation that isolates the search process from
the specific improvement technique, enabling a much simplified synergistic
workflow. We provide a case study using a basic local search to compare
compiler optimisation, algorithm configuration, and genetic improvement. We
chose running time as our efficiency measure and evaluated our approach on four
real-world software, written in C, C++, and Java.
Our results show that, used independently, all techniques find significant
running time improvements: up to 25% for compiler optimisation, 97% for
algorithm configuration, and 61% for evolving source code using genetic
improvement. We also show that up to 10% further increase in performance can be
obtained with partial combinations of the variants found by the different
techniques. Furthermore, the common representation also enables simultaneous
exploration of all techniques, providing a competitive alternative to using
each technique individually.Comment: 19 page
Hierarchical learning of cross-language mappings through distributed vector representations for code
Ministry of Education, Singapore under its Academic Research Funding Tier
- …