211 research outputs found
Cloud WorkBench - Infrastructure-as-Code Based Cloud Benchmarking
To optimally deploy their applications, users of Infrastructure-as-a-Service
clouds are required to evaluate the costs and performance of different
combinations of cloud configurations to find out which combination provides the
best service level for their specific application. Unfortunately, benchmarking
cloud services is cumbersome and error-prone. In this paper, we propose an
architecture and concrete implementation of a cloud benchmarking Web service,
which fosters the definition of reusable and representative benchmarks. In
distinction to existing work, our system is based on the notion of
Infrastructure-as-Code, which is a state of the art concept to define IT
infrastructure in a reproducible, well-defined, and testable way. We
demonstrate our system based on an illustrative case study, in which we measure
and compare the disk IO speeds of different instance and storage types in
Amazon EC2
User Review-Based Change File Localization for Mobile Applications
In the current mobile app development, novel and emerging DevOps practices
(e.g., Continuous Delivery, Integration, and user feedback analysis) and tools
are becoming more widespread. For instance, the integration of user feedback
(provided in the form of user reviews) in the software release cycle represents
a valuable asset for the maintenance and evolution of mobile apps. To fully
make use of these assets, it is highly desirable for developers to establish
semantic links between the user reviews and the software artefacts to be
changed (e.g., source code and documentation), and thus to localize the
potential files to change for addressing the user feedback. In this paper, we
propose RISING (Review Integration via claSsification, clusterIng, and
linkiNG), an automated approach to support the continuous integration of user
feedback via classification, clustering, and linking of user reviews. RISING
leverages domain-specific constraint information and semi-supervised learning
to group user reviews into multiple fine-grained clusters concerning similar
users' requests. Then, by combining the textual information from both commit
messages and source code, it automatically localizes potential change files to
accommodate the users' requests. Our empirical studies demonstrate that the
proposed approach outperforms the state-of-the-art baseline work in terms of
clustering and localization accuracy, and thus produces more reliable results.Comment: 15 pages, 3 figures, 8 table
07491 Abstracts Collection -- Mining Programs and Processes
From 02.12. to 17.12.2007, the Dagstuhl Seminar 07491 ``Mining Programs and Processes\u27\u27 was held in Schloss Dagstuhl~--~Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available
Time variance and defect prediction in software projects: Towards an exploitation of periods of stability and change as well as a notion of concept drift in software projects
It is crucial for a software manager to know whether or not one can rely on a bug prediction model. A wrong prediction of the number or the location of future bugs can lead to problems in the achievement of a project's goals. In this paper we first verify the existence of variability in a bug prediction model's accuracy over time both visually and statistically. Furthermore, we explore the reasons for such a high variability over time, which includes periods of stability and variability of prediction quality, and formulate a decision procedure for evaluating prediction models before applying them. To exemplify our findings we use data from four open source projects and empirically identify various project features that influence the defect prediction quality. Specifically, we observed that a change in the number of authors editing a file and the number of defects fixed by them influence the prediction quality. Finally, we introduce an approach to estimate the accuracy of prediction models that helps a project manager decide when to rely on a prediction model. Our findings suggest that one should be aware of the periods of stability and variability of prediction quality and should use approaches such as ours to assess their models' accuracy in advanc
On-the-Fly Syntax Highlighting using Neural Networks
With the presence of online collaborative tools for software developers,
source code is shared and consulted frequently, from code viewers to merge
requests and code snippets. Typically, code highlighting quality in such
scenarios is sacrificed in favor of system responsiveness. In these on-the-fly
settings, performing a formal grammatical analysis of the source code is not
only expensive, but also intractable for the many times the input is an invalid
derivation of the language. Indeed, current popular highlighters heavily rely
on a system of regular expressions, typically far from the specification of the
language's lexer. Due to their complexity, regular expressions need to be
periodically updated as more feedback is collected from the users and their
design unwelcome the detection of more complex language formations. This paper
delivers a deep learning-based approach suitable for on-the-fly grammatical
code highlighting of correct and incorrect language derivations, such as code
viewers and snippets. It focuses on alleviating the burden on the developers,
who can reuse the language's parsing strategy to produce the desired
highlighting specification. Moreover, this approach is compared to nowadays
online syntax highlighting tools and formal methods in terms of accuracy and
execution time, across different levels of grammatical coverage, for three
mainstream programming languages. The results obtained show how the proposed
approach can consistently achieve near-perfect accuracy in its predictions,
thereby outperforming regular expression-based strategies.Comment: Accepted for publication in the ACM Joint European Software
Engineering Conference and Symposium on the Foundations of Software
Engineering (ESEC/FSE 2022
A framework for semi-automated software evolution analysis composition
Software evolution data stored in repositories such as version control, bug and issue tracking, or mailing lists is crucial to better understand a software system and assess its quality. A myriad of analyses exploiting such data have been proposed throughout the years. However, easy and straight forward synergies between these analyses rarely exist. To tackle this problem we have investigated the concept of Software Analysis as a Service and devised SOFAS, a distributed and collaborative software evolution analysis platform. Software analyses are offered as services that can be accessed, composed into workflows, and executed over the Internet. This paper presents our framework for composing these analyses into workflows, consisting of a custom-made modeling language and a composition infrastructure for the service offerings. The framework exploits the RESTful nature of our analysis service architecture and comes with a service composer to enable semi-automated service compositions by a user. We validate our framework by showcasing two different approaches built on top of it that support different stakeholders in gaining a deeper insight into a project history and evolution. As a result, our framework has shown its applicability to deliver diverse, complex analyses across system and tool boundarie
Discovering Loners and Phantoms in Commit and Issue Data
The interlinking of commit and issue data has become a de-facto standard in software development. Modern issue tracking systems, such as JIRA, automatically interlink commits and issues by the extraction of identifiers (e.g., issue key) from commit messages. However, the conventions for the use of interlinking methodologies vary between software projects. For example, some projects enforce the use of identifiers for every commit while others have less restrictive conventions. In this work, we introduce a model called PaLiMod to enable the analysis of interlinking characteristics in commit and issue data. We surveyed 15 Apache projects to investigate differences and commonalities between linked and non-linked commits and issues. Based on the gathered information, we created a set of heuristics to interlink the residual of non-linked commits and issues. We present the characteristics of Loners and Phantoms in commit and issue data. The results of our evaluation indicate that the proposed PaLiMod model and heuristics enable an automatic interlinking and can indeed reduce the residual of non-linked commits and issues in software projects
- …