32,400 research outputs found
An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation
Unit tests play a key role in ensuring the correctness of software. However,
manually creating unit tests is a laborious task, motivating the need for
automation. Large Language Models (LLMs) have recently been applied to this
problem, utilizing additional training or few-shot learning on examples of
existing tests. This paper presents a large-scale empirical evaluation on the
effectiveness of LLMs for automated unit test generation without additional
training or manual effort, providing the LLM with the signature and
implementation of the function under test, along with usage examples extracted
from documentation. We also attempt to repair failed generated tests by
re-prompting the model with the failing test and error message. We implement
our approach in TestPilot, a test generation tool for JavaScript that
automatically generates unit tests for all API functions in an npm package. We
evaluate TestPilot using OpenAI's gpt3.5-turbo LLM on 25 npm packages with a
total of 1,684 API functions. The generated tests achieve a median statement
coverage of 70.2% and branch coverage of 52.8%, significantly improving on
Nessie, a recent feedback-directed JavaScript test generation technique, which
achieves only 51.3% statement coverage and 25.6% branch coverage. We also find
that 92.8% of TestPilot's generated tests have no more than 50% similarity with
existing tests (as measured by normalized edit distance), with none of them
being exact copies. Finally, we run TestPilot with two additional LLMs,
OpenAI's older code-cushman-002 LLM and the open LLM StarCoder. Overall, we
observed similar results with the former (68.2% median statement coverage), and
somewhat worse results with the latter (54.0% median statement coverage),
suggesting that the effectiveness of the approach is influenced by the size and
training set of the LLM, but does not fundamentally depend on the specific
model
Inviwo -- A Visualization System with Usage Abstraction Levels
The complexity of today's visualization applications demands specific
visualization systems tailored for the development of these applications.
Frequently, such systems utilize levels of abstraction to improve the
application development process, for instance by providing a data flow network
editor. Unfortunately, these abstractions result in several issues, which need
to be circumvented through an abstraction-centered system design. Often, a high
level of abstraction hides low level details, which makes it difficult to
directly access the underlying computing platform, which would be important to
achieve an optimal performance. Therefore, we propose a layer structure
developed for modern and sustainable visualization systems allowing developers
to interact with all contained abstraction levels. We refer to this interaction
capabilities as usage abstraction levels, since we target application
developers with various levels of experience. We formulate the requirements for
such a system, derive the desired architecture, and present how the concepts
have been exemplary realized within the Inviwo visualization system.
Furthermore, we address several specific challenges that arise during the
realization of such a layered architecture, such as communication between
different computing platforms, performance centered encapsulation, as well as
layer-independent development by supporting cross layer documentation and
debugging capabilities
J-PET Framework: Software platform for PET tomography data reconstruction and analysis
J-PET Framework is an open-source software platform for data analysis,
written in C++ and based on the ROOT package. It provides a common environment
for implementation of reconstruction, calibration and filtering procedures, as
well as for user-level analyses of Positron Emission Tomography data. The
library contains a set of building blocks that can be combined by users with
even little programming experience, into chains of processing tasks through a
convenient, simple and well-documented API. The generic input-output interface
allows processing the data from various sources: low-level data from the
tomography acquisition system or from diagnostic setups such as digital
oscilloscopes, as well as high-level tomography structures e.g. sinograms or a
list of lines-of-response. Moreover, the environment can be interfaced with
Monte Carlo simulation packages such as GEANT and GATE, which are commonly used
in the medical scientific community.Comment: 14 pages, 5 figure
Extensible Component Based Architecture for FLASH, A Massively Parallel, Multiphysics Simulation Code
FLASH is a publicly available high performance application code which has
evolved into a modular, extensible software system from a collection of
unconnected legacy codes. FLASH has been successful because its capabilities
have been driven by the needs of scientific applications, without compromising
maintainability, performance, and usability. In its newest incarnation, FLASH3
consists of inter-operable modules that can be combined to generate different
applications. The FLASH architecture allows arbitrarily many alternative
implementations of its components to co-exist and interchange with each other,
resulting in greater flexibility. Further, a simple and elegant mechanism
exists for customization of code functionality without the need to modify the
core implementation of the source. A built-in unit test framework providing
verifiability, combined with a rigorous software maintenance process, allow the
code to operate simultaneously in the dual mode of production and development.
In this paper we describe the FLASH3 architecture, with emphasis on solutions
to the more challenging conflicts arising from solver complexity, portable
performance requirements, and legacy codes. We also include results from user
surveys conducted in 2005 and 2007, which highlight the success of the code.Comment: 33 pages, 7 figures; revised paper submitted to Parallel Computin
Designing and evaluating the usability of a machine learning API for rapid prototyping music technology
To better support creative software developers and music technologists' needs, and to empower them as machine learning users and innovators, the usability of and developer experience with machine learning tools must be considered and better understood. We review background research on the design and evaluation of application programming interfaces (APIs), with a focus on the domain of machine learning for music technology software development. We present the design rationale for the RAPID-MIX API, an easy-to-use API for rapid prototyping with interactive machine learning, and a usability evaluation study with software developers of music technology. A cognitive dimensions questionnaire was designed and delivered to a group of 12 participants who used the RAPID-MIX API in their software projects, including people who developed systems for personal use and professionals developing software products for music and creative technology companies. The results from the questionnaire indicate that participants found the RAPID-MIX API a machine learning API which is easy to learn and use, fun, and good for rapid prototyping with interactive machine learning. Based on these findings, we present an analysis and characterization of the RAPID-MIX API based on the cognitive dimensions framework, and discuss its design trade-offs and usability issues. We use these insights and our design experience to provide design recommendations for ML APIs for rapid prototyping of music technology. We conclude with a summary of the main insights, a discussion of the merits and challenges of the application of the CDs framework to the evaluation of machine learning APIs, and directions to future work which our research deems valuable
Seglearn: A Python Package for Learning Sequences and Time Series
Seglearn is an open-source python package for machine learning time series or
sequences using a sliding window segmentation approach. The implementation
provides a flexible pipeline for tackling classification, regression, and
forecasting problems with multivariate sequence and contextual data. This
package is compatible with scikit-learn and is listed under scikit-learn
Related Projects. The package depends on numpy, scipy, and scikit-learn.
Seglearn is distributed under the BSD 3-Clause License. Documentation includes
a detailed API description, user guide, and examples. Unit tests provide a high
degree of code coverage
- …