Towards an automation of the traceability of bugs from development logs: A study based on open source software
Context: Information and tracking of defects can be severely incomplete in almost every Open Source project, resulting in a reduced traceability of defects into the development logs (i.e., version control commit logs). In particular, defect data often appears not in sync when considering what developers logged as their actions. Synchronizing or completing the missing data of the bug repositories, with the logs detailing the actions of developers, would benefit various branches of empirical software engineering research: prediction of software faults, software reliability, traceability, software quality, effort and cost estimation, bug prediction and bug fixing.
Objective: To design a framework that automates the process of synchronizing and filling the gaps of the development logs and bug issue data for open source software projects.
Method: We instantiate the framework with a sample of OSS projects from GitHub by parsing, linking and filling the gaps found in their bug issue data and development logs. UML diagrams show the relevant modules that will be used to merge, link and connect the bug issue data with the development data.
Results: Analysing a sample of over 300 OSS projects, we observed that around half of the bug-related data is present in either development logs or issue tracker logs: the rest of the data is missing from one or the other source. We designed an automated approach that fills the gaps of either source by making use of the available data, and we successfully mapped all the missing data of the analysed projects when using one heuristic for annotating bugs. Other heuristics need to be investigated and implemented.
Conclusion: In this paper, a framework to synchronise the development logs and bug data used in empirical software engineering was designed to automatically fill the missing parts of development logs and bug issue data.
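The abstract does not say which annotation heuristic the framework uses. As a rough illustration only, the sketch below assumes one common convention, a closing keyword followed by `#<number>` in the commit message, and shows how such references could be linked to tracker issues while reporting the gaps on each side. The pattern `ISSUE_REF`, the function name, and the sample data are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical heuristic (an assumption, not the paper's): a keyword such as
# "fixes", "closed" or "resolve" followed by "#<number>" marks a bug reference.
ISSUE_REF = re.compile(
    r"\b(?:fix(?:e[sd])?|close[sd]?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)


def link_commits_to_issues(commits, issue_ids):
    """Link commits to tracker issues and report gaps on both sides.

    commits   -- iterable of (sha, message) pairs from the development log
    issue_ids -- set of issue numbers known to the issue tracker
    Returns (links, missing_in_tracker, missing_in_log).
    """
    links = {}
    referenced = set()
    for sha, message in commits:
        refs = {int(n) for n in ISSUE_REF.findall(message)}
        referenced |= refs
        known = refs & issue_ids
        if known:
            links[sha] = sorted(known)
    # Referenced by a commit but absent from the tracker data.
    missing_in_tracker = referenced - issue_ids
    # Present in the tracker but never referenced by any commit.
    missing_in_log = issue_ids - referenced
    return links, missing_in_tracker, missing_in_log


if __name__ == "__main__":
    sample_commits = [
        ("a1", "Fixes #12: null check"),
        ("b2", "Refactor parser"),
        ("c3", "closes #99"),
    ]
    print(link_commits_to_issues(sample_commits, {12, 40}))
```

The two "missing" sets correspond to the gaps the abstract describes: data present in one source but not in the other.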
Discovering Activities in Software Development Processes
Software development processes are complex to monitor as they involve the coordination of many resources working with different tools. This makes it hard to apply mining techniques for monitoring the process. A key challenge for using traces of tools such as version control systems (VCS) is to find meaningful abstractions in order to identify the work that was actually done. In this paper, we use data from VCS to analyze the actual progress of software-development processes. We develop a technique that is able to mine the activity types of which the development processes consist. We implement our technique as a prototype in Java and evaluate its outputs in terms of effectiveness. In this way, we are able to graphically uncover new behavioural patterns in real-world data from existing open-source GitHub repositories.
Beyond Surveys: Analyzing Software Development Artifacts to Assess Teaching Efforts
This Innovative Practice Full Paper presents an approach of using software
development artifacts to gauge student behavior and the effectiveness of
changes to curriculum design. There is an ongoing need to adapt university
courses to changing requirements and shifts in industry. As an educator it is
therefore vital to have access to methods, with which to ascertain the effects
of curriculum design changes. In this paper, we present our approach of
analyzing software repositories in order to gauge student behavior during
project work. We evaluate this approach in a case study of a university
undergraduate software development course teaching agile development
methodologies. Surveys revealed positive attitudes towards the course and the
change of employed development methodology from Scrum to Kanban. However,
surveys were not usable to ascertain the degree to which students had adapted
their workflows and whether they had done so in accordance with course goals.
Therefore, we analyzed students' software repository data, which represents
information that can be collected by educators to reveal insights into learning
successes and detailed student behavior. We analyze the software repositories
created during the last five courses, and evaluate differences in workflows
between Kanban and Scrum usage
Towards using fluctuations in internal quality metrics to find design intents
Version control is the backbone of the modern software development workflow. While building more and more complex
systems, developers have to understand unfamiliar subsystems of source code. Understanding the logic of unfamiliar code
is relatively straightforward. However, understanding its design and its genesis is often only possible through
scattered and unreliable commit messages and project documentation -- when they exist.
Thus, developers need a reliable and relevant baseline to understand the history of software projects. In this thesis,
we take the first steps towards understanding change patterns in commit histories. We study the changes in software
metrics through the evolution of projects.
Through multiple exploratory studies, we conduct quantitative and qualitative experiments on several datasets extracted
from a pool of 13 projects. We mine the changes in software metrics for each commit of the respective projects and
manually build oracles to represent ground truth.
We identified several categories by analyzing these changes. One pattern, in particular, dubbed "tradeoffs", where some
metrics may improve at the expense of others, proved to be a promising indicator of design-related changes -- in some
cases, also hinting at a conscious design intent from the authors of the changes. Demonstrating the findings of our
exploratory studies, we build a general model to identify the application of a well-known set of design principles in
new projects.
Our overall results suggest that metric fluctuations have the potential to be relevant indicators for valuable
macroscopic insights about the design evolution in a project's development history
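The "tradeoffs" pattern described above can be illustrated with a small sketch that classifies a commit by the direction of its metric changes. The metric names, the `LOWER_IS_BETTER` table, and the category labels other than "tradeoff" are assumptions made for illustration; the abstract does not specify which metrics or categories the thesis uses.

```python
# Hypothetical direction table (an assumption): True means a lower value of
# the metric is considered better, False means a higher value is better.
LOWER_IS_BETTER = {"coupling": True, "complexity": True, "cohesion": False}


def classify_commit(before, after):
    """Classify a commit from metric values measured before and after it."""
    improved, worsened = [], []
    for name, lower_is_better in LOWER_IS_BETTER.items():
        delta = after[name] - before[name]
        if delta == 0:
            continue  # unchanged metrics do not count either way
        got_better = (delta < 0) == lower_is_better
        (improved if got_better else worsened).append(name)
    if improved and worsened:
        return "tradeoff"  # some metrics improve at the expense of others
    if improved:
        return "improvement"
    if worsened:
        return "degradation"
    return "neutral"


if __name__ == "__main__":
    before = {"coupling": 5, "complexity": 10, "cohesion": 0.5}
    after = {"coupling": 4, "complexity": 12, "cohesion": 0.5}
    print(classify_commit(before, after))
```

In this sketch a commit counts as a tradeoff only when at least one metric improves and at least one worsens; any real study would also need thresholds for noise, which are left out here.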
Mining software repositories to determine the impact of team factors on the structural attributes of software
This thesis was submitted for the award of PhD and was awarded by Brunel University London. Software development is intrinsically a human activity and the role of the development team has been established as among the most decisive of all project success factors. Prior research has proven empirically that team size and stability are linked to stakeholder satisfaction, team productivity and fault-proneness. Team size is usually considered a measure of the number of developers that modify the source code of a project while team stability is typically a function of the cumulative time that each team member has worked with their fellow team members. There is, however, limited research investigating the impact of these factors on software maintainability - a crucial aspect given that up to 80% of development budgets are consumed in the maintenance phase of the lifecycle. This research sheds light on how these aspects of team composition influence the structural attributes of the developed software that, in turn, drive the maintenance costs of software. This thesis asserts that new and broader insights can be gained by measuring these internal attributes of the software rather than the more traditional approach of measuring its external attributes. This can also enable practitioners to measure and monitor key indicators throughout the development lifecycle, taking remedial action where appropriate. Within this research the GoogleCode open-source forge is mined and a sample of 1,480 Java projects is selected for further study. Using the Chidamber and Kemerer design metrics suite, the impact of development team size and stability on the internal structural attributes of software is isolated and quantified. Drawing on prior research correlating these internal attributes with external attributes, the impact on maintainability is deduced.
This research finds that those structural attributes that have been established to correlate to fault-proneness - coupling, cohesion and modularity - show degradation as team sizes increase or team stability decreases. That degradation in the internal attributes of the software is associated with a deterioration in the sub-attributes of maintainability: changeability, understandability, testability and stability.
Interaction-Based Creation and Maintenance of Continuously Usable Trace Links
Traceability is a major concern for all software engineering artefacts. The core of traceability are trace links between the artefacts. Out of the links between all kinds of artefacts, trace links between requirements and source code are fundamental, since they enable the connection between the user point of view of a requirement and its actual implementation. Trace links are important for many software engineering tasks such as maintenance, program comprehension, verification, etc. Furthermore, the direct availability of trace links during a project improves the performance of developers.
The manual creation of trace links is too time-consuming to be practical. Thus, traceability research has a strong focus on automatic trace link creation. The most common automatic trace link creation methods use information retrieval techniques to measure the textual similarity between artefacts. The results of the textual similarity measurement are then used to judge the creation of links between artefacts. The application of such information retrieval techniques results in a lot of wrong link candidates and requires further expert knowledge to make the automatically created links usable, insomuch as it is necessary to manually vet the link candidates. This fact prevents the usage of information retrieval techniques to create trace links continuously and directly provide them to developers during a project.
Thus, this thesis addresses the problem of continuously providing trace links of a good quality to developers during a project and to maintain these links along with changing artefacts. To achieve this, a novel automatic trace link creation approach called Interaction Log Recording-based Trace Link Creation (ILog) has been designed and evaluated. ILog utilizes the interactions of developers with source code while implementing requirements. In addition, ILog uses the common development convention to provide issues' identifiers in a commit message, to assign recorded interactions to requirements. Thus ILog avoids additional manual efforts from the developers for link creation.
ILog has been implemented in a set of tools. The tools enable the recording of interactions in different integrated development environments and the subsequent creation of trace links. Trace links are created between source code files which have been touched by interactions and the current requirement which is being worked on. The trace links which are initially created in this way are further improved by utilizing interaction data such as interaction duration, frequency, type, etc. and source code structure, i.e. source code references between source code files involved in trace links. ILog's link improvement removes potentially wrong links and subsequently adds further correct links.
ILog was evaluated in three empirical studies using gold standards created by experts. One of the studies used data from an open source project. In the two other studies, student projects involving a real world customer were used. The results of the studies showed that ILog can create trace links with perfect precision and good recall, which enables the direct usage of the links. The studies also showed that the ILog approach has better precision and recall than other automatic trace link creation approaches, such as information retrieval.
To identify trace link maintenance capabilities suitable for the integration in ILog, a systematic literature review about trace link maintenance was performed. In the systematic literature review the trace link maintenance approaches which were found are discussed on the basis of a standardized trace link maintenance process. Furthermore, the extension of ILog with suitable trace link maintenance capabilities from the approaches found is illustrated
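The link creation step described in this abstract (recorded interactions assigned to a requirement through the issue identifier in the commit message) could be sketched roughly as follows. This is not ILog's implementation: the Jira-style key pattern `ISSUE_KEY`, the `(file_path, seconds)` interaction format, and the `min_seconds` duration filter standing in for link improvement are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical identifier format (an assumption): Jira-style keys like "PROJ-42".
ISSUE_KEY = re.compile(r"\b([A-Z][A-Z0-9]+-\d+)\b")


def create_trace_links(interactions, commit_message, min_seconds=10.0):
    """Create requirement-to-file trace links for one commit.

    interactions   -- list of (file_path, seconds) recorded since the last commit
    commit_message -- message expected to contain the issue identifier
    min_seconds    -- hypothetical threshold: files touched for less total time
                      are dropped, as a crude stand-in for link improvement
    Returns {issue_key: [file_path, ...]} or {} if no identifier is found.
    """
    match = ISSUE_KEY.search(commit_message)
    if match is None:
        return {}
    total = defaultdict(float)
    for path, seconds in interactions:
        total[path] += seconds  # accumulate interaction duration per file
    files = sorted(p for p, s in total.items() if s >= min_seconds)
    return {match.group(1): files}


if __name__ == "__main__":
    recorded = [("A.java", 8.0), ("A.java", 5.0), ("B.java", 2.0)]
    print(create_trace_links(recorded, "PROJ-42 add login form"))
```

Because the identifier comes from the commit message, the developer does no extra work to create a link, which is the property the abstract emphasises.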
The Potential for Neutrino Physics at Muon Colliders and Dedicated High Current Muon Storage Rings
Conceptual design studies are underway for muon colliders and other
high-current muon storage rings that have the potential to become the first
true "neutrino factories". Muon decays in long straight sections of the
storage rings would produce precisely characterized beams of electron and muon
type neutrinos of unprecedented intensity. This article reviews the prospects
for these facilities to greatly extend our capabilities for neutrino
experiments, largely emphasizing the physics of neutrino interactions.
Comment: 107 pages, 16 figures, to be published in Physics Report
Intelligent support for knowledge sharing in virtual communities
Virtual communities where people with common interests and goals communicate, share resources, and construct knowledge, are currently one of the fastest growing web environments. A common misconception is to believe that a virtual community will be effective when people and technology are present. Appropriate support for the effective functioning of online communities is paramount. In this line, personalisation and adaptation can play a crucial role, as illustrated by recent user modelling approaches that support social web-groups. However, personalisation research has mainly focused on adapting to the needs of individual members, as opposed to supporting communities to function as a whole. In this research, we argue that effective support tailored to virtual communities requires considering the wholeness of the community and facilitating the processes that influence the success of knowledge sharing and collaboration. We are focusing on closely knit communities that operate in the boundaries of organisations or in the educational sector. Following research in organisational psychology, we have identified several processes important for effective team functioning which can be applied to virtual communities and can be examined or facilitated by analysing community log data. Based on the above processes we defined a computational framework that consists of two major parts. The first deals with the extraction of a community model that represents the whole community and the second deals with the application of the model in order to identify what adaptive support is needed and when. The validation of this framework has been done using real virtual community data and the advantages of the adaptive support have been examined based on the changes happened after the interventions in the community combined with user feedback. 
With this thesis we contribute to the user modelling and adaptive systems research communities with: (a) a novel framework for holistic adaptive support in virtual communities, (b) a mechanism for extracting and maintaining a semantic community model based on the processes identified, and (c) deployment of the community model to identify problems and provide holistic support to a virtual community. We also contribute to the CSCW community with a novel approach in providing semantically enriched community awareness and to the area of social networks with a semantically enriched approach for modeling change patterns in a closely-knit VC.