Report on the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3)
This report records and discusses the Third Workshop on Sustainable Software
for Science: Practice and Experiences (WSSSPE3). The report includes a
description of the keynote presentation of the workshop, which served as an
overview of sustainable scientific software. It also summarizes a set of
lightning talks in which speakers highlighted to-the-point lessons and
challenges pertaining to sustaining scientific software. The final and main
contribution of the report is a summary of the discussions, future steps, and
future organization for a set of self-organized working groups on topics
including developing pathways to funding scientific software; constructing
useful common metrics for crediting software stakeholders; identifying
principles for sustainable software engineering design; reaching out to
research software organizations around the world; and building communities for
software sustainability. For each group, we include a point of contact and a
landing page that can be used by those who want to join that group's future
activities. The main challenge left by the workshop is to see whether the groups
will carry out the activities they have scheduled, and how the WSSSPE
community can encourage this to happen.
Connecting Software Metrics across Versions to Predict Defects
Accurate software defect prediction could help software practitioners
allocate test resources to defect-prone modules effectively and efficiently. In
recent decades, much effort has been devoted to building accurate defect
prediction models, including developing quality defect predictors and modeling
techniques. However, widely used defect predictors such as code metrics
and process metrics do not describe well how software modules change over
the course of project evolution, which we believe is important for defect
prediction. To deal with this problem, in this paper we propose to use the
Historical Version Sequence of Metrics (HVSM) in continuous software versions
as defect predictors. Furthermore, we leverage a Recurrent Neural Network (RNN),
a popular modeling technique, which takes HVSM as input to build software
prediction models. The experimental results show that, in most cases, the
proposed HVSM-based RNN model has significantly better effort-aware ranking
effectiveness than the commonly used baseline models.
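
As a rough illustration of what a historical version sequence of metrics could look like as data, the sketch below collects, for each module in the latest version, its metric vectors across the consecutive versions in which it exists. The function name `build_hvsm`, the input layout, and the optional `max_len` truncation are assumptions made for this example; the abstract does not specify how the sequences are assembled, and the RNN itself is not shown.

```python
def build_hvsm(versions, max_len=None):
    """Build one metric sequence per module present in the latest version.

    versions: list of dicts ordered oldest to newest; each dict maps a
              module name to a list of metric values (hypothetical layout).
    max_len:  optional cap on sequence length (keeps the most recent entries).
    """
    if not versions:
        return {}
    latest = versions[-1]
    sequences = {}
    for module in latest:
        seq = []
        # Walk backwards from the latest version and stop at the first
        # version in which the module is missing, so that the sequence
        # only covers consecutive versions.
        for snapshot in reversed(versions):
            if module not in snapshot:
                break
            seq.append(snapshot[module])
        seq.reverse()  # restore oldest-to-newest order
        if max_len is not None:
            seq = seq[-max_len:]
        sequences[module] = seq
    return sequences


# Hypothetical metrics: [lines of code, number of past changes]
versions = [
    {"a.py": [10, 1]},
    {"a.py": [12, 2], "b.py": [5, 0]},
    {"a.py": [15, 2], "b.py": [7, 1], "c.py": [3, 0]},
]
hvsm = build_hvsm(versions)
```

Sequences built this way have different lengths per module, which is one reason a sequence model such as an RNN is a natural fit for this kind of input.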
Developing an h-index for OSS developers
The public data available in Open Source Software (OSS) repositories has been used for many practical purposes: detecting community structures; identifying key roles among developers; understanding software quality; predicting the occurrence of bugs in large OSS systems; and so on. It has also been used to formulate and validate new metrics and proofs of concept on general, non-OSS-specific software engineering aspects. One result that has not yet emerged from the analysis of OSS repositories is how to help the "career advancement" of developers: given the available data on products and processes used in OSS development, it should be possible to produce measurements that identify and describe a developer and that could be used externally as a measure of recognition and experience. This paper builds on the h-index, which is used in academic contexts to determine the recognition of a researcher among her peers. By creating similar indices for OSS (or any) developers, this work could help define a baseline for measuring and comparing the contributions of OSS developers in an objective, open and reproducible way.
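
For reference, the academic h-index is the largest h such that h items each have at least h citations. The sketch below computes it from a list of counts. Applying it to developers by feeding in a per-contribution count (for example, how often each of a developer's contributions is reused) is an assumption made here for illustration; the abstract does not say which quantity the paper uses.

```python
def h_index(counts):
    """Return the largest h such that at least h items have a count >= h."""
    ranked = sorted(counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break  # counts only get smaller from here on
    return h


# Hypothetical per-contribution counts for one developer
developer_counts = [10, 8, 5, 4, 3]
print(h_index(developer_counts))  # prints 4
```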
git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories
Data from software repositories have become an important foundation for the
empirical study of software engineering processes. A recurring theme in the
repository mining literature is the inference of developer networks capturing
e.g. collaboration, coordination, or communication from the commit history of
projects. Most of the studied networks are based on the co-authorship of
software artefacts defined at the level of files, modules, or packages. While
this approach has led to insights into the social aspects of software
development, it neglects detailed information on code changes and code
ownership, e.g. which exact lines of code have been authored by which
developers, that is contained in the commit log of software projects.
Addressing this issue, we introduce git2net, a scalable Python software that
facilitates the extraction of fine-grained co-editing networks in large git
repositories. It uses text mining techniques to analyse the detailed history of
textual modifications within files. This information allows us to construct
directed, weighted, and time-stamped networks, where a link signifies that one
developer has edited a block of source code originally written by another
developer. Our tool is applied in case studies of an Open Source and a
commercial software project. We argue that it opens up a massive new source of
high-resolution data on human collaboration patterns.
Comment: MSR 2019, 12 pages, 10 figures
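
The kind of network described above can be sketched in plain Python, assuming the co-editing events have already been extracted. This does not use git2net's actual API. The input tuples, the edge direction (from the editor to the original author), the weight (number of edits), and the decision to skip self-edits are all assumptions made for this example.

```python
from collections import defaultdict


def build_coediting_network(edits):
    """Aggregate co-editing events into directed, weighted, time-stamped edges.

    edits: iterable of (editor, original_author, timestamp) tuples
           (hypothetical input format).
    Returns a dict mapping (editor, original_author) to a dict with the
    edge weight and the sorted list of timestamps.
    """
    timestamps = defaultdict(list)
    for editor, original_author, ts in edits:
        if editor == original_author:
            continue  # assumption: editing one's own code is not a co-edit
        timestamps[(editor, original_author)].append(ts)
    return {
        pair: {"weight": len(ts_list), "timestamps": sorted(ts_list)}
        for pair, ts_list in timestamps.items()
    }


# Hypothetical events: (who edited, who wrote the code originally, when)
edits = [
    ("bob", "alice", 100),
    ("bob", "alice", 50),
    ("alice", "bob", 70),
    ("carol", "carol", 80),
]
network = build_coediting_network(edits)
```

Keeping the raw timestamps on each edge, rather than only a count, preserves the temporal ordering needed for time-respecting analyses of the network.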
Understanding spatial data usability
In recent geographical information science literature, a number of researchers have made passing reference to an apparently new characteristic of spatial data known as 'usability'. While this attribute is well known to professionals engaged in software engineering and computer interface design and testing, extension of the concept to embrace information would seem to be a new development. Furthermore, while notions such as the use and value of spatial information, and the diffusion of spatial information systems, have been the subject of research since the late 1980s, the current references to usability clearly represent something which extends well beyond that initial research. Accordingly, the purposes of this paper are: (1) to understand what is meant by spatial data usability; (2) to identify the elements that might comprise usability; and (3) to consider what the related research questions might be.