256 research outputs found
Human-Centric Program Synthesis
Program synthesis techniques offer significant new capabilities in searching for programs that satisfy high-level specifications. While synthesis has been thoroughly explored for input/output pair specifications (programming-by-example), this paper asks: what does program synthesis look like beyond examples? What actual issues in day-to-day development would stand to benefit the most from synthesis? How can a human-centric perspective inform the exploration of alternative specification languages for synthesis? I sketch a human-centric vision for program synthesis in which programmers explore and learn languages and APIs aided by a synthesis tool.
Refactoring traces to identify concurrency improvements
It is often difficult to analyse why a program executes more slowly than intended. This is particularly true for concurrent programs. We describe and evaluate a system, Rehype, which takes Java programs, performs low-overhead tracing of method calls, analyses the resulting trace-logs to detect inefficient uses of concurrency constructs, and suggests source-code-oriented improvements. Rehype deals with task-based concurrency, specifically a future-based model of tasks. Implementing the suggested improvements on an industrial API server more than doubled request-processing throughput. The first author was funded by the Engineering and Physical Sciences Research Council (EPSRC), the Cambridge Trusts, and the University of Cambridge Department of Computer Science and Technology.
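To make the idea of an inefficient use of futures concrete, here is a minimal sketch in Python rather than Java. The abstract does not say which patterns Rehype detects, so the pattern below (waiting on each future before submitting the next task) is an assumption chosen for illustration, and `fetch`, `blocking_style`, and `deferred_style` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(item):
    """Stand-in for a slow, independent task (hypothetical workload)."""
    return item * item


def blocking_style(items):
    """Submit a task, then wait for it before submitting the next one.

    Each result() call blocks, so the tasks run one after another even
    though a thread pool is available.
    """
    results = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for item in items:
            future = pool.submit(fetch, item)
            results.append(future.result())  # blocks before the next submit
    return results


def deferred_style(items):
    """Submit every task first, then collect the results.

    The tasks can now overlap, and the output order is unchanged.
    """
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fetch, item) for item in items]
        return [future.result() for future in futures]
```

Both functions return the same values; only the point at which the caller blocks differs, which is the kind of source-level change a trace-driven suggestion could plausibly point to.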
Practical heuristics to improve precision for erroneous function argument swapping detection in C and C++
Argument selection defects, in which the programmer chooses the wrong argument to pass to a parameter from a potential set of arguments in a function call, are a widely investigated problem. In statically typed programming languages, the compiler can detect such misuse only through the argument and parameter types. When adjacent parameters have the same type or can be converted between one another, a swapped or out-of-order call will not be diagnosed by compilers. Related research is usually confined to exact type equivalence, often ignoring potential implicit or explicit conversions. However, in current mainstream languages such as C++, built-in conversions between numeric types and user-defined conversions may significantly increase the number of mistakes that go unnoticed. We investigated the situation for C and C++, where developers can define functions with multiple adjacent parameters that allow arguments to be passed in the wrong order. When implicit conversions, such as parameter pairs of types ``(int, bool)``, are taken into account, the number of mistake-prone functions markedly increases compared to strict type equivalence alone. We analysed a sample of projects and categorised the offending parameter types.
The empirical results should further encourage the language and library development community to emphasise the importance of strong typing and to restrict the proliferation of implicit conversions. However, the analysis produces a hard-to-consume amount of diagnostics for existing projects, and there are always cases that match the analysis rule but cannot be “fixed”. As such, further heuristics are needed to allow developers to refactor effectively based on the analysis results. We devised such heuristics, measured their expressive power, and found that several simple heuristics greatly help highlight the more problematic cases.
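The difference between strict type equivalence and conversion-aware matching can be sketched as follows. This is not the authors' analysis: the `NUMERIC` table is a deliberately tiny, hypothetical stand-in for C++ implicit conversions, and the function names and the example signature are invented for illustration.

```python
# Hypothetical, deliberately tiny model of implicit conversions; real C++
# conversion rules are far richer than this table.
NUMERIC = {"bool", "char", "int", "long", "float", "double"}


def convertible(a, b):
    """True if an argument of type a could be passed where b is expected."""
    return a == b or (a in NUMERIC and b in NUMERIC)


def swappable_pairs(params, allow_conversions=True):
    """Return adjacent parameter pairs that could be swapped silently.

    params is a list of (type, name) tuples in declaration order.
    """
    pairs = []
    for (t1, n1), (t2, n2) in zip(params, params[1:]):
        if allow_conversions:
            risky = convertible(t1, t2) and convertible(t2, t1)
        else:
            risky = t1 == t2
        if risky:
            pairs.append((n1, n2))
    return pairs


# Hypothetical signature: void resize(int width, bool keep_ratio, std::string name)
signature = [("int", "width"), ("bool", "keep_ratio"), ("std::string", "name")]
strict = swappable_pairs(signature, allow_conversions=False)
relaxed = swappable_pairs(signature, allow_conversions=True)
```

Under strict equivalence nothing is flagged for this signature, while the conversion-aware check flags `(width, keep_ratio)`, mirroring the ``(int, bool)`` case mentioned in the abstract.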
Energyware analysis
This document introduces "Energyware" as a software engineering discipline aiming at defining, analyzing and optimizing the energy consumption of software systems. In this paper we present energyware analysis in the context of programming languages, software data structures and program source code. For each of these areas we describe the research work done in the context of the Green Software Laboratory at Minho University: we describe energy-aware techniques, tools, libraries, and repositories. This work is financed by the ERDF European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme within project POCI-01-0145-FEDER-006961, and by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project POCI-01-0145-FEDER-016718 and UID/EEA/50014/2013. The first author is also sponsored by FCT grant SFRH/BD/112733/2015.
UNGOML: Automated Classification of unsafe Usages in Go
The Go programming language offers strong protection from memory corruption.
As an escape hatch from these protections, it provides the unsafe package.
Previous studies identified that this unsafe package is frequently used in
real-world code for several purposes, e.g., serialization or casting types. Due
to the variety of these reasons, it may be possible to refactor specific usages
to avoid potential vulnerabilities. However, the classification of unsafe
usages is challenging and requires the context of the call and the program's
structure. In this paper, we present the first automated classifier for unsafe
usages in Go, UNGOML, to identify what is done with the unsafe package and why
it is used. For UNGOML, we built four custom deep learning classifiers trained
on a manually labeled data set. We represent Go code as enriched control-flow
graphs (CFGs) and solve the label prediction task with one single-vertex and
three context-aware classifiers. All three context-aware classifiers achieve a
top-1 accuracy of more than 86% for both dimensions, WHAT and WHY. Furthermore,
in a set-valued conformal prediction setting, we achieve accuracies of more
than 93% with mean label set sizes of 2 for both dimensions. Thus, UNGOML can
be used to efficiently filter unsafe usages for use cases such as refactoring
or a security audit. UNGOML: https://github.com/stg-tud/ungoml Artifact:
https://dx.doi.org/10.6084/m9.figshare.22293052
Comment: 13 pages, accepted at the 2023 IEEE/ACM 20th International Conference
on Mining Software Repositories (MSR 2023)
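As a rough illustration of what a set-valued conformal prediction setting means, the sketch below implements generic split conformal prediction over class probabilities. It is not UNGOML's implementation; the function names, the label names, and the calibration numbers are all hypothetical.

```python
import math


def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Compute the split-conformal score threshold.

    cal_probs: list of dicts mapping label -> predicted probability.
    cal_labels: the true label for each calibration example.
    alpha: tolerated miscoverage rate, e.g. 0.1 for about 90% coverage.
    """
    # Nonconformity score: one minus the probability given to the true label.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank of the score used as the threshold.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points: include every label
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """Return every label whose nonconformity score is within the threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= threshold}


# Hypothetical calibration data for three made-up labels.
cal_probs = [
    {"cast": 0.9, "serialization": 0.05, "other": 0.05},
    {"cast": 0.1, "serialization": 0.8, "other": 0.1},
    {"cast": 0.3, "serialization": 0.1, "other": 0.6},
    {"cast": 0.7, "serialization": 0.2, "other": 0.1},
]
cal_labels = ["cast", "serialization", "other", "cast"]
new_probs = {"cast": 0.75, "serialization": 0.2, "other": 0.05}
```

A smaller `alpha` demands higher coverage and therefore yields larger label sets, which is the trade-off behind reporting accuracy together with mean label set size.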
Unleashing the Power of Clippy in Real-World Rust Projects
Clippy lints are considered essential tools for Rust developers, as they
can be configured as gate-keeping rules for a Rust project during continuous
integration. Despite their availability, little was known about the practical
application and cost-effectiveness of the lints in reducing code quality
issues. In this study, we embark on a comprehensive analysis to unveil the true
impact of Clippy lints in the Rust development landscape. The study is
structured around three interrelated components, each contributing to the
overall effectiveness of Clippy. Firstly, we conduct a comprehensive analysis
of Clippy lints in all idiomatic crates-io Rust projects with an average
warning density of 21/KLOC. The analysis identifies the most cost-effective
lint fixes, offering valuable opportunities for optimizing code quality.
Secondly, we actively engage Rust developers through a user survey to garner
invaluable feedback on their experiences with Clippy. User insights shed light
on two crucial concerns: the prevalence of false positives in warnings and the
need for auto-fix support for most warnings. Thirdly, building upon these
findings, we engineer three innovative automated refactoring techniques to
effectively fix the four most frequent Clippy lints. As a result, the warning
density in Rosetta benchmarks has significantly decreased from 195/KLOC to an
impressive 18/KLOC, already lower than the average density of the crates-io
Rust projects. These results demonstrate the tangible benefit and impact of our
efforts in enhancing the overall code quality and maintainability for Rust
developers.
Exploring Automated Code Evaluation Systems and Resources for Code Analysis: A Comprehensive Survey
The automated code evaluation system (AES) is mainly designed to reliably
assess user-submitted code. Due to their extensive range of applications and
the accumulation of valuable resources, AESs are becoming increasingly popular.
Research on the application of AES and their real-world resource exploration
for diverse coding tasks is still lacking. In this study, we conducted a
comprehensive survey on AESs and their resources. This survey explores the
application areas of AESs, available resources, and resource utilization for
coding tasks. AESs are categorized into programming contests, programming
learning and education, recruitment, online compilers, and additional modules,
depending on their application. We explore the available datasets and other
resources of these systems for research, analysis, and coding tasks. Moreover,
we provide an overview of machine learning-driven coding tasks, such as bug
detection, code review, comprehension, refactoring, search, representation, and
repair. These tasks are performed using real-life datasets. In addition, we
briefly discuss the Aizu Online Judge platform as a real example of an AES from
the perspectives of system design (hardware and software), operation
(competition and education), and research. This is due to the scalability of
the AOJ platform (programming education, competitions, and practice), open
internal features (hardware and software), attention from the research
community, open source data (e.g., solution codes and submission documents),
and transparency. We also analyze the overall performance of this system and
the perceived challenges over the years.