Human vs. Algorithm
We consider the roles of algorithm and human and their inter-relationships. As a vehicle for some of our ideas we describe an empirical investigation of software professionals using analogy-based tools and unaided search to solve various prediction problems. We conclude that there exists a class of software engineering problems, characterised as high value and low frequency, where the human-algorithm interaction must be considered carefully if prediction systems are to be successfully deployed in industry.
An Investigation of Rule Induction Based Prediction Systems
Traditionally, researchers have used either off-the-shelf models such as COCOMO, or developed local models using statistical techniques such as stepwise regression, to predict software effort estimates. More recently, attention has turned to a variety of machine learning methods such as artificial neural networks (ANNs), case-based reasoning (CBR) and rule induction (RI). This position paper outlines some preliminary research into the use of rule induction methods to build software cost models. We briefly describe the use of rule induction methods and then apply the technique to a dataset of 81 software projects derived from a Canadian software house in the late 1980s. We show that RI methods tend to be unstable and generally predict with quite variable accuracy. Pruning the feature set, however, has a significant impact upon accuracy. We also compare our results with a prediction system based upon a standard regression procedure. We suggest that further work be carried out to examine the effects of the relationships among, and between, project features on the generated rules, in an attempt to improve on current prediction techniques and enhance our understanding of machine learning methods.
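To make the idea of rule induction for effort prediction concrete, the following is a minimal, illustrative sketch that uses a shallow regression tree, whose root-to-leaf paths read as if-then rules, as a stand-in for an RI learner. All data below are synthetic; the feature names are assumptions for illustration, not fields of the 81-project Canadian dataset.

```python
# Illustrative sketch only: rule induction for effort prediction,
# approximated by a shallow regression tree whose paths act as rules.
# Data are synthetic; "fp" and "exp" are assumed, invented features.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(42)
n = 81
fp = rng.integers(50, 500, n)     # project size in function points (assumed)
exp = rng.integers(1, 5, n)       # team experience level 1-4 (assumed)
effort = fp * 12 / exp + rng.normal(0, 200, n)  # person hours (synthetic)

X = np.column_stack([fp, exp])
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, effort)

# Each root-to-leaf path is a human-readable prediction rule.
print(export_text(tree, feature_names=["fp", "exp"]))
```

The instability the abstract reports can be seen by refitting such a learner on bootstrap resamples of the projects: the printed rule set typically changes from resample to resample.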
Making Software Cost Data Available for Meta-Analysis
In this paper we consider the increasing need for meta-analysis within empirical software engineering. However, we also note that a necessary precondition to such forms of analysis is to have both the results in an appropriate format and sufficient contextual information to avoid misleading inferences. We consider the implications in the field of software project effort estimation and show that for a sample of 12 seemingly similar published studies, the results are difficult to compare, let alone combine. This is due to different reporting conventions. We argue that a protocol is required and make some suggestions as to what it should contain.
An Empirical Analysis of Software Productivity
The aim of our research is to discover what factors impact software project productivity (measured as function points per hour) using real-world data. Within this overall goal we also compare productivity between different business sectors and project types. We analysed a data set of almost 700 projects collected by STTF from a number of Finnish companies since 1978. These projects are quite diverse in terms of type (new and maintenance projects), size (6 to over 5000 function points), effort (55 to over 60000 person hours), application domain and implementation technology. There are three main findings. First, productivity varies enormously between projects. Second, project type has limited influence on productivity. Third, application domain or business area has a major impact upon productivity. Because this data set is not a random sample, generalisation is somewhat problematic; nevertheless, we hope that it contributes to an overall body of knowledge about software productivity and thereby facilitates the construction of a bigger picture.
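The productivity measure used above, function points delivered per person hour, and the per-sector comparison can be sketched as follows. The project records and sector names are invented for illustration; they are not the STTF/Finnish data.

```python
# Minimal sketch of the productivity measure described above
# (function points per person hour), averaged by business sector.
# All records below are invented; the fields are assumptions.
from collections import defaultdict
from statistics import mean

projects = [
    # (sector, function points, person hours) -- assumed schema
    ("banking",   320, 2400),
    ("banking",  1500, 9000),
    ("insurance",  90,  500),
    ("retail",    640, 6400),
]

by_sector = defaultdict(list)
for sector, fp, hours in projects:
    by_sector[sector].append(fp / hours)

for sector, rates in sorted(by_sector.items()):
    print(f"{sector}: {mean(rates):.3f} FP/hour")
```

Comparing sector means like this, rather than pooling all projects, is what allows the business-area effect reported above to be separated from the project-to-project variation.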
An Analysis of Data Sets Used to Train and Validate Cost Prediction Systems
OBJECTIVE - the aim of this investigation is to build up a picture of the nature and type of data sets being used to develop and evaluate different software project effort prediction systems. We believe this to be important since there is a growing body of published work that seeks to assess different prediction approaches. Unfortunately, results to date are rather inconsistent, so we are interested in the extent to which this might be explained by different data sets.
METHOD - we performed an exhaustive search, from 1980 onwards, of three software engineering journals for research papers that used project data sets to compare cost prediction systems.
RESULTS - this identified a total of 50 papers that used, one or more times, a total of 74 unique project data sets. We observed that some of the better-known and publicly accessible data sets were used repeatedly, making them potentially disproportionately influential. Such data sets also tend to be amongst the oldest, with potential problems of obsolescence. We also note that only about 70% of all data sets are in the public domain, and this can be particularly problematic when the data set description is incomplete or limited. Finally, extracting relevant information from research papers has been time consuming due to different styles of presentation and levels of contextual information.
CONCLUSIONS - we believe there are two lessons to learn. First, the community needs to consider the quality and appropriateness of the data set being utilised; not all data sets are equal. Second, we need to assess the way results are presented in order to facilitate meta-analysis, and whether a standard protocol would be appropriate.
Understanding object feature binding through experimentation as a precursor to modelling
In order to explore underlying brain mechanisms and to further understand how and where object feature binding occurs, psychophysical data are analysed and will be modelled using an attractor network. This paper describes psychophysical work and an outline of the proposed model. A rapid serial visual processing paradigm with a post-cue response task was used in three experimental conditions: spatial, temporal and spatio-temporal. Using a ‘staircase’ procedure, stimulus onset asynchrony for each observer for each condition was set in practice trials to achieve ~50% error rates. Results indicate that spatial location information helps bind object features and temporal location information hinders it. Our expectation is that the proposed neural model will demonstrate a binding mechanism by exhibiting regions of enhanced activity in the location of the target when presented with a partial post-cue. In future work, the model could be lesioned so that neuropsychological phenomena might be exhibited. In such a way, the mechanisms underlying object feature binding might be clarified.
System architecture metrics: an evaluation
The research described in this dissertation is a study of the application of measurement, or metrics, to software engineering. This is not in itself a new idea; the concept of measuring software was first mooted close on twenty years ago. However, examination of what is a considerable body of metrics work reveals that incorporating measurement into software engineering is rather less straightforward than one might presuppose and, despite the advancing years, there is still a lack of maturity.
The thesis commences with a dissection of three of the most popular metrics, namely Halstead's software science, McCabe's cyclomatic complexity and Henry and Kafura's information flow - all of which might be regarded as having achieved classic status. Despite their popularity these metrics are all flawed in at least three respects. First and foremost, in each case it is unclear exactly what is being measured: instead there is a preponderance of such metaphysical terms as complexity and quality. Second, each metric is theoretically doubtful in that it exhibits anomalous behaviour. Third, much of the claimed empirical support for each metric is spurious, arising from poor experimental design and inappropriate statistical analysis. It is argued that these problems are not misfortune but the inevitable consequence of the ad hoc and unstructured approach of much metrics research: in particular the scant regard paid to the role of underlying models.
This research seeks to address these problems by proposing a systematic method for the development and evaluation of software metrics. The method is a goal-directed combination of formal modelling techniques and empirical evaluation. The method is applied to the problem of developing metrics to evaluate software designs - from the perspective of a software engineer wishing to minimise implementation difficulties, faults and future maintenance problems. It highlights a number of weaknesses within the original model. These are tackled in a second, more sophisticated model which is multidimensional, that is, it combines, in this case, two metrics. Both the theoretical and empirical analysis show this model to have utility in its ability to identify hard-to-implement and unreliable aspects of software designs. It is concluded that this method goes some way towards introducing a little more rigour into the development, evaluation and evolution of metrics for the software engineer.
Problem reports and team maturity in agile automotive software development
Background: Volvo Cars is pioneering an agile transformation on a large scale
in the automotive industry. Social psychological aspects of automotive software
development are an under-researched area in general. Few studies on team
maturity or group dynamics can be found specifically in the automotive software
engineering domain. Objective: This study is intended as an initial step to
fill that gap by investigating the connection between issues and problem
reports and team maturity. Method: We conducted a quantitative study with 84
participants from 14 teams and qualitatively validated the result with the
Release Train Engineer, who has an overview of all the participating teams.
Results: We find that the more mature a team is, the faster they seem to
resolve issues as provided through external feedback, at least in the two
initial team maturity stages. Conclusion: This study suggests that working on
team dynamics might increase productivity in modern automotive software
development departments, but this needs further investigation.
The impact of using biased performance metrics on software defect prediction research
Context: Software engineering researchers have undertaken many experiments
investigating the potential of software defect prediction algorithms.
Unfortunately, some widely used performance metrics are known to be
problematic, most notably F1, yet F1 remains in widespread use.
Objective: To investigate the potential impact of using F1 on the validity of
this large body of research.
Method: We undertook a systematic review to locate relevant experiments and
then extract all pairwise comparisons of defect prediction performance using F1
and the unbiased Matthews correlation coefficient (MCC).
Results: We found a total of 38 primary studies. These contain 12,471 pairs
of results. Of these, 21.95% changed direction when the MCC metric was used
instead of the biased F1 metric. Unfortunately, we also found evidence
suggesting that F1 remains widely used in software defect prediction research.
Conclusions: We reiterate the concerns of statisticians that the F1 is a
problematic metric outside of an information retrieval context, since we are
concerned about both classes (defect-prone and not defect-prone units). This
inappropriate usage has led to a substantial number (more than one fifth) of
erroneous (in terms of direction) results. Therefore we urge researchers to (i)
use an unbiased metric and (ii) publish detailed results including confusion
matrices such that alternative analyses become possible.
Comment: Submitted to the journal Information & Software Technology. It is a greatly extended version of "Assessing Software Defection Prediction Performance: Why Using the Matthews Correlation Coefficient Matters" presented at EASE 202
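The direction-flip the review quantifies is easy to reproduce from the metric definitions themselves. In the sketch below, on an imbalanced test set, a trivial "predict everything defective" model A beats a genuinely informative model B on F1, while MCC ranks them the other way round. The two confusion matrices are invented purely for illustration.

```python
# Why F1 can mislead in defect prediction: it ignores true negatives,
# while MCC uses all four cells of the confusion matrix.
# The confusion matrices below are invented for illustration.
import math

def f1(tp, fp, fn):
    # F1 never looks at true negatives -- the source of its bias.
    return 2 * tp / (2 * tp + fp + fn)

def mcc(tp, tn, fp, fn):
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0  # convention: 0 for degenerate cases

# Test set: 90 defect-prone units, 10 clean ones.
# A labels everything defective: TP=90, TN=0, FP=10, FN=0.
# B misses 10 defects but raises no false alarms: TP=80, TN=10, FP=0, FN=10.
print(f"A: F1={f1(90, 10, 0):.3f}  MCC={mcc(90, 0, 10, 0):.3f}")
print(f"B: F1={f1(80, 0, 10):.3f}  MCC={mcc(80, 10, 0, 10):.3f}")
```

Here A scores higher on F1 (about 0.947 versus 0.941) despite carrying no information at all (MCC = 0, versus about 0.667 for B), which is exactly the kind of reversed pairwise comparison the review counts.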
An investigation of machine learning based prediction systems
Traditionally, researchers have used either off-the-shelf models such as COCOMO, or developed local models using statistical techniques such as stepwise regression, to obtain software effort estimates. More recently, attention has turned to a variety of machine learning methods such as artificial neural networks (ANNs), case-based reasoning (CBR) and rule induction (RI). This paper outlines some comparative research into the use of these three machine learning methods to build software effort prediction systems. We briefly describe each method and then apply the techniques to a dataset of 81 software projects derived from a Canadian software house in the late 1980s. We compare the prediction systems in terms of three factors: accuracy, explanatory value and configurability. We show that ANN methods have superior accuracy and that RI methods are least accurate. However, this view is somewhat counteracted by problems with explanatory value and configurability. For example, we found that considerable effort was required to configure the ANN and that this compared very unfavourably with the other techniques, particularly CBR and least squares regression (LSR). We suggest that further work be carried out, both to further explore interaction between the end-user and the prediction system, and also to facilitate configuration, particularly of ANNs.