
    Arhivske vijesti o pučkoj drami u srednjoj Dalmaciji [Archival Reports on Folk Drama in Central Dalmatia]

    Context: The research literature on software development projects usually assumes that effort is a good proxy for cost. Practice, however, suggests that there are circumstances in which cost and effort should be distinguished. Objectives: We determine similarities and differences between the size, effort, cost, duration, and number of defects of software projects. Method: We compare two established repositories (ISBSG and EBSPM) comprising almost 700 projects from industry. Results: We demonstrate a (log-)linear relation between cost on the one hand, and size, duration, and number of defects on the other, which justifies conducting linear regression for cost. We establish that ISBSG differs substantially from EBSPM in terms of cost (ISBSG projects are cheaper) and duration (they are faster), as well as in the relation between cost and effort. We show that while effort is the most important cost factor in ISBSG, this is not the case in other repositories, such as EBSPM, in which size is the dominant factor. Conclusion: Practitioners and researchers alike should be cautious when drawing conclusions from a single repository.
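
    To make the (log-)linear cost relation concrete, the sketch below fits an ordinary least-squares model to log-transformed project variables. It is a minimal illustration on synthetic data: the variable names, units, and coefficient values are assumptions for the example, not the actual ISBSG or EBSPM data.

        import numpy as np

        rng = np.random.default_rng(0)
        n = 200
        size = rng.lognormal(4.0, 1.0, n)      # e.g. function points (illustrative)
        duration = rng.lognormal(2.0, 0.5, n)  # e.g. calendar months (illustrative)
        defects = rng.lognormal(1.5, 0.8, n)
        # Generate synthetic cost that is log-linear in the predictors, plus noise.
        log_cost = (1.0 + 0.7 * np.log(size) + 0.4 * np.log(duration)
                    + 0.2 * np.log(defects) + rng.normal(0.0, 0.3, n))

        # Ordinary least squares on the log-transformed variables; the fitted
        # coefficients are elasticities of cost with respect to each factor.
        X = np.column_stack([np.ones(n), np.log(size),
                             np.log(duration), np.log(defects)])
        coef, *_ = np.linalg.lstsq(X, log_cost, rcond=None)
        print("intercept and elasticities:", coef.round(2))

    Exponentiating the fitted model turns it back into a multiplicative cost model, cost ≈ exp(b0) * size^b1 * duration^b2 * defects^b3, which is the usual reading of a log-linear relation.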

    confstream: automated algorithm selection and configuration of stream clustering algorithms

    Machine learning has become one of the most important tools in data analysis. However, selecting the most appropriate machine learning algorithm and tuning its hyperparameters to their optimal values remains a difficult task. This is even more difficult for streaming applications, where automated approaches to algorithm selection and configuration are often not available. This paper proposes the first approach for automated algorithm selection and configuration of stream clustering algorithms. We train an ensemble of different stream clustering algorithms and configurations in parallel and use the best-performing configuration to obtain a clustering solution. By drawing new configurations from better-performing ones, we are able to improve the ensemble's performance over time. In large experiments on real and artificial data, we show how our ensemble approach can improve upon default configurations and can also compete with a-posteriori algorithm configuration. Our approach is considerably faster than a-posteriori approaches and applicable in real time. In addition, it is not limited to stream clustering and can be generalised to all streaming applications, including stream classification and regression.
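
    The ensemble scheme can be sketched roughly as follows: several configurations learn from each batch in parallel, are scored periodically on a sliding window of recent points, and the worst member is replaced by a configuration drawn near the best one. This is a minimal illustration of the general idea, not the confstream implementation; scikit-learn's MiniBatchKMeans stands in for a stream clustering algorithm, and the silhouette score stands in for whatever evaluation measure is used in practice.

        import numpy as np
        from sklearn.cluster import MiniBatchKMeans
        from sklearn.metrics import silhouette_score

        rng = np.random.default_rng(1)

        def new_config(k):
            # One ensemble member: a clustering algorithm plus its configuration.
            return {"k": k, "model": MiniBatchKMeans(n_clusters=k, random_state=0)}

        ensemble = [new_config(k) for k in (2, 4, 8)]
        window = []  # recent points used for evaluation

        for step in range(50):  # simulated stream batches
            # Each simulated batch is centred on one of three locations.
            batch = rng.normal(size=(100, 2)) + 3 * rng.integers(0, 3)
            window = (window + batch.tolist())[-500:]
            for cfg in ensemble:
                cfg["model"].partial_fit(batch)  # all members learn in parallel

            if step % 10 == 9:  # periodic evaluation on the sliding window
                W = np.asarray(window)
                scores = [silhouette_score(W, cfg["model"].predict(W))
                          for cfg in ensemble]
                best, worst = int(np.argmax(scores)), int(np.argmin(scores))
                # Draw a new configuration near the best-performing one.
                k_new = int(max(2, ensemble[best]["k"] + rng.integers(-1, 2)))
                ensemble[worst] = new_config(k_new)
                print(f"step {step}: best k={ensemble[best]['k']}, "
                      f"silhouette={scores[best]:.2f}")

    Evaluating on a sliding window rather than on all past data keeps the selection responsive to drift, which is one reason an online scheme like this can stay applicable in real time.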

    Data mining for software engineering and humans in the loop

    The field of data mining for software engineering has been growing over the last decade. This field is concerned with using data mining to provide useful insights into how to improve software engineering processes and software itself, supporting decision-making. For that, data produced by software engineering processes and products during and after software development are used. Despite promising results, there is frequently a lack of discussion on the role of software engineering practitioners amidst the data mining approaches, which makes the adoption of data mining by software engineering practitioners difficult. Moreover, the fact that experts' knowledge is frequently ignored by data mining approaches, together with the lack of transparency of such approaches, can hinder the acceptability of data mining by software engineering practitioners. To overcome these problems, this position paper provides a discussion of the role of software engineering experts when adopting data mining approaches. It also argues that this role can be extended to increase experts' involvement in the process of building data mining models. We believe that such extended involvement is not only likely to increase software engineers' acceptance of the resulting models, but also to improve the models themselves. We also provide some recommendations aimed at increasing the success of expert involvement and model acceptability.

    A Brief Survey on Concept Drift


    Multi-Source Transfer Learning for Non-Stationary Environments

    In data stream mining, predictive models typically suffer drops in predictive performance due to concept drift. As enough data representing the new concept must be collected for the new concept to be well learnt, the predictive performance of existing models usually takes some time to recover from concept drift. To speed up recovery from concept drift and improve predictive performance in data stream mining, this work proposes a novel approach called Multi-sourcE onLine TrAnsfer learning for Non-statIonary Environments (Melanie). Melanie is the first approach able to transfer knowledge between multiple data streaming sources in non-stationary environments. It creates several sub-classifiers to learn different aspects from different source and target concepts over time. The sub-classifiers that match the current target concept well are identified and used to compose an ensemble for predicting examples from the target concept. We evaluate Melanie on several synthetic data streams containing different types of concept drift and on real-world data streams. The results indicate that Melanie can deal with a variety of drifts and improve predictive performance over existing data stream learning algorithms by making use of multiple sources.
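
    The multi-source ensemble idea can be sketched as follows: one online sub-classifier is kept per source stream plus one for the target, each is weighted by its faded recent accuracy on target examples, and predictions are made by weighted vote so that sub-classifiers matching the current target concept dominate. This is a simplified illustration under those assumptions, not the Melanie algorithm itself; the streams, concepts, and drift point are synthetic.

        import numpy as np
        from sklearn.linear_model import SGDClassifier

        rng = np.random.default_rng(2)
        classes = np.array([0, 1])

        def make_clf():
            return SGDClassifier(loss="log_loss", random_state=0)

        def stream(concept, n=50):
            # A batch whose labels follow a linear concept.
            X = rng.normal(size=(n, 2))
            return X, (X @ concept > 0).astype(int)

        subs = {"source_a": make_clf(), "source_b": make_clf(), "target": make_clf()}
        weights = {name: 1.0 for name in subs}

        for t in range(40):
            Xa, ya = stream(np.array([1.0, 0.0]))   # source stream A
            Xb, yb = stream(np.array([0.0, 1.0]))   # source stream B
            # The target concept drifts halfway through the run.
            Xt, yt = stream(np.array([1.0, 0.0]) if t < 20 else np.array([0.0, 1.0]))

            subs["source_a"].partial_fit(Xa, ya, classes=classes)
            subs["source_b"].partial_fit(Xb, yb, classes=classes)

            # Weight each fitted sub-classifier by its faded accuracy on the
            # current target batch, so matching concepts dominate after drift.
            for name, clf in subs.items():
                if hasattr(clf, "coef_"):
                    acc = (clf.predict(Xt) == yt).mean()
                    weights[name] = 0.8 * weights[name] + 0.2 * acc

            subs["target"].partial_fit(Xt, yt, classes=classes)

            # Weighted-vote prediction from all fitted sub-classifiers.
            votes = sum(weights[name] * subs[name].predict_proba(Xt)
                        for name in subs if hasattr(subs[name], "coef_"))
            y_pred = classes[np.argmax(votes, axis=1)]

            if t % 10 == 9:
                print(t, {name: round(w, 2) for name, w in weights.items()})

    In this toy run, the weight of source_b should overtake source_a soon after the drift at t = 20, mirroring how knowledge transferred from a matching source can speed up recovery from concept drift.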