Software Engineers' Information Seeking Behavior in Change Impact Analysis - An Interview Study
Software engineers working in large projects must navigate complex
information landscapes. Change Impact Analysis (CIA) is a task that relies on
engineers' successful information seeking in databases storing, e.g., source
code, requirements, design descriptions, and test case specifications. Several
previous approaches to support information seeking are task-specific, thus
understanding engineers' seeking behavior in specific tasks is fundamental. We
present an industrial case study on how engineers seek information in CIA, with
a particular focus on traceability and development artifacts that are not
source code. We show that engineers have different information seeking
behavior, and that some do not consider traceability particularly useful when
conducting CIA. Furthermore, we observe a tendency for engineers to prefer less
rigid types of support rather than formal approaches, i.e., engineers value
support that allows flexibility in how to practically conduct CIA. Finally, due
to diverse information seeking behavior, we argue that future CIA support
should embrace individual preferences to identify change impact by empowering
several seeking alternatives, including searching, browsing, and tracing.
Comment: Accepted for publication in the proceedings of the 25th International
Conference on Program Comprehension
Open Science principles for accelerating trait-based science across the Tree of Life.
Synthesizing trait observations and knowledge across the Tree of Life remains a grand challenge for biodiversity science. Species traits are widely used in ecological and evolutionary science, and new data and methods have proliferated rapidly. Yet accessing and integrating disparate data sources remains a considerable challenge, slowing progress toward a global synthesis to integrate trait data across organisms. Trait science needs a vision for achieving global integration across all organisms. Here, we outline how the adoption of key Open Science principles (open data, open source and open methods) is transforming trait science, increasing transparency, democratizing access and accelerating global synthesis. To enhance widespread adoption of these principles, we introduce the Open Traits Network (OTN), a global, decentralized community welcoming all researchers and institutions pursuing the collaborative goal of standardizing and integrating trait data across organisms. We demonstrate how adherence to Open Science principles is key to the OTN community and outline five activities that can accelerate the synthesis of trait data across the Tree of Life, thereby facilitating rapid advances to address scientific inquiries and environmental issues. Lessons learned along the path to a global synthesis of trait data will provide a framework for addressing similarly complex data science and informatics challenges.
Can Refactoring be Self-Affirmed? An Exploratory Study on How Developers Document their Refactoring Activities in Commit Messages
Refactoring is a critical task in software maintenance and is usually performed to enforce best design practices, or to cope with design defects. Previous studies heavily rely on defining a set of keywords to identify refactoring commits from a list of general commits extracted from a small set of software systems. All approaches thus far consider all commits without checking whether refactorings had actually happened or not. In this paper, we aim at exploring how developers document their refactoring activities during the software life cycle. We call such activity Self-Affirmed Refactoring, which is an indication of the developer-related refactoring events in the commit messages. Our approach relies on text mining refactoring-related change messages and identifying refactoring patterns by only considering refactoring commits. We found that (1) developers use a variety of patterns to purposefully target refactoring-related activities; (2) developers tend to explicitly mention the improvement of specific quality attributes and code smells; and (3) commit messages with self-affirmed refactoring patterns tend to have more significant refactoring activity
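As a rough illustration of what mining commit messages for self-affirmed refactoring patterns could look like, here is a minimal Python sketch. It is not the authors' method: the pattern list, the function names `find_patterns` and `count_patterns`, and the sample messages are all hypothetical, and the sketch assumes the input messages already come from commits known to contain refactorings.

```python
import re
from collections import Counter

# Hypothetical patterns chosen for illustration only; the abstract does not
# list the patterns the study actually found.
PATTERNS = {
    "refactor": re.compile(r"\brefactor(?:s|ed|ing)?\b", re.IGNORECASE),
    "clean up": re.compile(r"\bclean(?:ed|ing)?[ -]?up\b", re.IGNORECASE),
    "simplify": re.compile(r"\bsimplif(?:y|ies|ied|ying)\b", re.IGNORECASE),
    "code smell": re.compile(r"\bcode smells?\b", re.IGNORECASE),
}


def find_patterns(message):
    """Return the names of all patterns that occur in one commit message."""
    return [name for name, regex in PATTERNS.items() if regex.search(message)]


def count_patterns(messages):
    """Count, per pattern, how many commit messages mention it."""
    counts = Counter()
    for message in messages:
        counts.update(find_patterns(message))
    return counts


# Made-up commit messages, assumed to belong to known refactoring commits.
commits = [
    "Refactored parser to remove duplicated code",
    "Fix null pointer in login handler",
    "Cleanup: simplify config loading",
]

for message in commits:
    print(message, "->", find_patterns(message))
```

A message that matches no pattern, such as the second one, would be a refactoring commit that the developer did not document as such.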
Deep Learning Software Repositories
Bridging the abstraction gap between artifacts and concepts is the essence of software engineering (SE) research problems. SE researchers regularly use machine learning to bridge this gap, but there are three fundamental issues with traditional applications of machine learning in SE research. Traditional applications are too reliant on labeled data. They are too reliant on human intuition, and they are not capable of learning expressive yet efficient internal representations. Ultimately, SE research needs approaches that can automatically learn representations of massive, heterogeneous datasets in situ, apply the learned features to a particular task and possibly transfer knowledge from task to task. Improvements in both computational power and the amount of memory in modern computer architectures have enabled new approaches to canonical machine learning tasks. Specifically, these architectural advances have enabled machines that are capable of learning deep, compositional representations of massive data depots. The rise of deep learning has ushered in tremendous advances in several fields. Given the complexity of software repositories, we presume deep learning has the potential to usher in new analytical frameworks and methodologies for SE research and the practical applications it reaches. This dissertation examines and enables deep learning algorithms in different SE contexts. We demonstrate that deep learners significantly outperform state-of-the-practice software language models at code suggestion on a Java corpus. Further, these deep learners for code suggestion automatically learn how to represent lexical elements. We use these representations to transmute source code into structures for detecting similar code fragments at different levels of granularity, without declaring features for how the source code is to be represented.
Then we use our learning-based framework for encoding fragments to intelligently select and adapt statements in a codebase for automated program repair. In our work on code suggestion, code clone detection, and automated program repair, everything for representing lexical elements and code fragments is mined from the source code repository. Indeed, our work aims to move SE research from the art of feature engineering to the science of automated discovery.
scikit-fda: A Python Package for Functional Data Analysis
The library scikit-fda is a Python package for Functional Data Analysis
(FDA). It provides a comprehensive set of tools for representation,
preprocessing, and exploratory analysis of functional data. The library is
built upon and integrated in Python's scientific ecosystem. In particular, it
conforms to the scikit-learn application programming interface so as to take
advantage of the functionality for machine learning provided by this package:
pipelines, model selection, and hyperparameter tuning, among others. The
scikit-fda package has been released as free and open-source software under a
3-Clause BSD license and is open to contributions from the FDA community. The
library's extensive documentation includes step-by-step tutorials and detailed
examples of use.