

Data Mining for Software Engineering

Author: LIU Chao
LO David
Thummalapenta Suresh
XIE Tao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2009

Institutional Knowledge at Singapore Management University

Data mining for software engineering and humans in the loop

Author: A Albrecht
A Corazza
A Tosun
AL Oliveira
B Turhan
BW Boehm
E Kocaguneli
E Mendes
EJ Weyuker
H Gall
K Dejaeger
L Minku
LC Briand
LL Minku
M Jorgensen
M Shepperd
P Runeson
S Chulani
S Kim
T Hall
T Menzies
Y Kamei
Y Kultur
Y Miyazaki
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The field of data mining for software engineering has been growing over the last decade. This field is concerned with the use of data mining to provide useful insights into how to improve software engineering processes and software itself, supporting decision-making. To that end, it uses data produced by software engineering processes and products during and after software development. Despite promising results, there is frequently a lack of discussion on the role of software engineering practitioners amidst the data mining approaches. This makes adoption of data mining by software engineering practitioners difficult. Moreover, the fact that experts’ knowledge is frequently ignored by data mining approaches, together with the lack of transparency of such approaches, can hinder the acceptance of data mining by software engineering practitioners. To overcome these problems, this position paper provides a discussion of the role of software engineering experts when adopting data mining approaches. It also argues that this role can be extended to increase experts’ involvement in the process of building data mining models. We believe that such extended involvement is likely not only to increase software engineers’ acceptance of the resulting models, but also to improve the models themselves. We also provide some recommendations aimed at increasing the success of experts’ involvement and model acceptance.

Crossref

Springer - Publisher Connector

University of Birmingham Research Portal

Brunel University Research Archive

Leicester Research Archive

Mining developer communication data streams

Author: Connor Andy M.
Finlay Jacqui
Pears Russel
Publication date: 22/07/2014

This paper explores the concept of modelling a software development project as a process that produces a continuous stream of data. In the Jazz repository used in this research, one aspect of that stream is developer communication. Such data can be used to create an evolving social network characterized by a range of metrics. This paper presents the application of data stream mining techniques to identify the most useful metrics for predicting build outcomes. Results are presented from applying the Hoeffding Tree classification method in conjunction with the Adaptive Sliding Window (ADWIN) method for detecting concept drift. The results indicate that only a small number of the metrics considered have any significance for predicting the outcome of a build.
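
For readers unfamiliar with the Hoeffding Tree method named in this abstract, the following is a minimal sketch of the standard Hoeffding bound that such trees use to decide when enough stream examples have been seen to split a node. It is not taken from the paper: the function names and the parameter values used with them are illustrative assumptions, and the ADWIN drift detector is not shown.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with the given range lies within this distance of the mean
    observed over n independent examples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split a leaf once the best attribute beats the runner-up by more than
    the bound, i.e. once the observed advantage is unlikely to be noise.
    (Hypothetical helper; real implementations also add tie-breaking.)"""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

The bound shrinks as more examples arrive, so a leaf that cannot separate two candidate metrics early in the stream may still split later.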

arXiv.org e-Print Archive

Crossref

git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories

Author: Gote Christoph
Scholtes Ingo
Schweitzer Frank
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/03/2019

Data from software repositories have become an important foundation for the empirical study of software engineering processes. A recurring theme in the repository mining literature is the inference of developer networks capturing e.g. collaboration, coordination, or communication from the commit history of projects. Most of the studied networks are based on the co-authorship of software artefacts defined at the level of files, modules, or packages. While this approach has led to insights into the social aspects of software development, it neglects detailed information on code changes and code ownership, e.g. which exact lines of code have been authored by which developers, that is contained in the commit log of software projects. Addressing this issue, we introduce git2net, a scalable Python software that facilitates the extraction of fine-grained co-editing networks in large git repositories. It uses text mining techniques to analyse the detailed history of textual modifications within files. This information allows us to construct directed, weighted, and time-stamped networks, where a link signifies that one developer has edited a block of source code originally written by another developer. Our tool is applied in case studies of an Open Source and a commercial software project. We argue that it opens up a massive new source of high-resolution data on human collaboration patterns.

Comment: MSR 2019, 12 pages, 10 figures
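
The kind of network this abstract describes can be illustrated with a small sketch. The code below is not git2net's API: the `Edit` record, both function names, and the choice to point each link from the editor to the original author are assumptions made for illustration, and the line-level edit records are assumed to have been extracted already.

```python
from collections import defaultdict
from typing import NamedTuple


class Edit(NamedTuple):
    """One co-editing event (hypothetical record, not a git2net type)."""
    editor: str           # developer who modified the block
    original_author: str  # developer who originally wrote the block
    timestamp: int        # Unix time of the editing commit


def build_coediting_network(edits):
    """Map each directed edge (editor, original_author) to the sorted
    timestamps at which that editor changed the other developer's code."""
    network = defaultdict(list)
    for edit in edits:
        if edit.editor == edit.original_author:
            continue  # assumption: self-edits are not collaboration links
        network[(edit.editor, edit.original_author)].append(edit.timestamp)
    return {edge: sorted(stamps) for edge, stamps in network.items()}


def edge_weights(network):
    """Weight each edge by its number of co-editing events."""
    return {edge: len(stamps) for edge, stamps in network.items()}
```

Keeping the timestamps on each edge, rather than only a count, is what makes it possible to study how collaboration changes over time.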

arXiv.org e-Print Archive

ZORA

Research on Software Project Developer Behaviors with K-means Clustering Analysis

Author: Li Xiaozhou
Publication venue: CEUR-WS
Publication date: 17/02/2020

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

Mining Competences of Expert Estimators

Author: Gotovac Sven
Karna Hrvoje
Publication venue: AIS Electronic Library (AISeL)
Publication date: 15/10/2014

This paper reports on a study conducted with the intention of identifying the competences of employees engaged on software development projects that are responsible for reliable effort estimation. Execution of assigned project tasks engages different human characteristics, and effort estimation is an integral part of the development process. Competences are defined as the knowledge, skills and abilities required to perform job assignments. As input data we used the company's internal classification and collection of employee competences, together with data sets of task effort estimates from ten projects executed in a department of the company specialized in the development of IT solutions in the telecom domain. The techniques used for modeling are proven data mining methods: the neural network and decision tree algorithms. The results provide a mapping of competences to effort estimates and represent valuable knowledge discovery that can be used in practice for the selection and evaluation of expert effort estimators.

AIS Electronic Library (AISeL)

Data mining use for learning process design of an information source locator agent

Author: Böhm Christian
Chiotti Omar Juan Alfredo
Galli María Rosa
Publication date: 01/10/2002

The aim of this work is to present a data mining application to software engineering. We describe the use of data mining in some parts of the design process of an agent-based architecture for a dynamic decision support system. The main function of this system is to guide information requirements from users to the domains most likely to answer them. For that purpose, a strategy is developed that gives the system the capacity to analyze an information requirement and determine to which domains it will be directed. To learn from errors made during its operation, a learning mechanism based on CBR techniques is also proposed. On the one hand, by using data mining techniques it is possible to define a discriminating function to classify the system domains into two groups: those that can probably provide an answer to the information requirement made to the system, and those that cannot. On the other hand, the application of data mining to the case base allows the specification of rules that establish relationships among the stored cases, with the aim of inferring possible causes of error in the domain classification. In this way, a learning mechanism is designed to update the knowledge base and thus improve the existing classification with regard to the values assigned to the discriminating function.

Track: Learning and pattern recognition. Red de Universidades con Carreras en Informática (RedUNCI)

Défis 2025

Author: Collet Philippe
Du Bousquet Lydie
Duchien Laurence
Moreau Pierre-Etienne
Publication venue: 'Lavoisier'
Publication date: 01/01/2015

New paradigms, languages, and approaches to modeling, verification, and testing, along with new tools for programming and software, should emerge over the next 10 years, whether to make life easier for designers and maintainers of computer systems, to model software and make it more reliable, or to anticipate technological change. This text summarizes the work carried out on the challenges facing the Programming and Software Engineering field on the horizon of 2025. This work was presented and discussed during the national days of the Research Group on Programming and Software Engineering in June 2014 and during a one-day meeting in September 2014 in Paris.

Hal - Université Grenoble Alpes

HAL-UNICE

INRIA a CCSD electronic archive server