Making inferences with small numbers of training sets
A potential methodological problem with empirical studies that assess project effort prediction systems is discussed. Frequently, a hold-out strategy is deployed, whereby the data set is split into a training set and a validation set. Inferences are then made concerning the relative accuracy of the different prediction techniques under examination, typically on the basis of very small numbers of sampled training sets. It is shown that such studies can lead to almost random results, particularly where relatively small effects are being studied. To illustrate this problem, two data sets are analysed using a configuration problem for case-based prediction, with results generated from 100 training sets. This enables results to be produced with quantified confidence limits. From this it is concluded that in both cases using fewer than five training sets leads to untrustworthy results, and that ideally more than 20 sets should be deployed. Unfortunately, this casts doubt on a number of empirical validations of prediction techniques, and so it is suggested that further research is needed as a matter of urgency.
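The protocol the paper argues for is easy to reproduce. Below is a minimal sketch, assuming NumPy arrays and a caller-supplied fit_predict(X_train, y_train, X_val) callable (both illustrative conventions, not artefacts of the paper): it scores a prediction technique over many sampled training sets so that accuracy can be quoted with empirical confidence limits rather than from one or two splits.

    import numpy as np

    # Illustrative sketch of repeated hold-out evaluation; not the paper's code.
    def mmre(actual, predicted):
        # Mean magnitude of relative error, a common accuracy statistic
        return np.mean(np.abs(actual - predicted) / actual)

    def repeated_holdout(X, y, fit_predict, n_splits=100, train_frac=0.67, seed=0):
        # Draw many random training/validation splits and score each one,
        # so accuracy can be reported with quantified confidence limits
        rng = np.random.default_rng(seed)
        n, scores = len(y), []
        for _ in range(n_splits):
            idx = rng.permutation(n)
            cut = int(train_frac * n)
            train, val = idx[:cut], idx[cut:]
            preds = fit_predict(X[train], y[train], X[val])
            scores.append(mmre(y[val], preds))
        scores = np.asarray(scores)
        low, high = np.percentile(scores, [2.5, 97.5])  # empirical 95% limits
        return scores.mean(), (low, high)

With n_splits set to 5 the interval returned is typically far wider than with 100 splits, which is precisely the paper's point about small numbers of training sets.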
Estimating software project effort using analogies
Accurate project effort prediction is an important goal for the software engineering community. To date, most work has focused upon building algorithmic models of effort, for example COCOMO. These can be calibrated to local environments. We describe an alternative approach to estimation based upon the use of analogies. The underlying principle is to characterise projects in terms of features (for example, the number of interfaces, the development method or the size of the functional requirements document). Completed projects are stored, and the problem then becomes one of finding the most similar projects to the one for which a prediction is required. Similarity is defined as Euclidean distance in n-dimensional space, where n is the number of project features. Each dimension is standardised so that all dimensions have equal weight. The known effort values of the nearest neighbours to the new project are then used as the basis for the prediction. The process is automated using a PC-based tool known as ANGEL. The method is validated on nine different industrial data sets (a total of 275 projects), and in all cases analogy outperforms algorithmic models based upon stepwise regression. From this work we argue that estimation by analogy is a viable technique that, at the very least, can be used by project managers to complement current estimation techniques.
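The distance-based procedure described above can be sketched in a few lines. The following is an illustrative re-implementation, not the ANGEL tool itself; the choice of k = 3 analogies and the use of the mean of the neighbours' effort values are assumptions, since the abstract only says the nearest neighbours form the basis of the prediction.

    import numpy as np

    # Illustrative sketch of estimation by analogy; not the ANGEL tool.
    def estimate_by_analogy(features, efforts, new_project, k=3):
        # features: (n_projects, n_features) matrix of completed projects
        # efforts:  (n_projects,) known effort values for those projects
        mu, sigma = features.mean(axis=0), features.std(axis=0)
        sigma[sigma == 0] = 1.0                      # guard constant features
        z = (features - mu) / sigma                  # standardise each dimension
        zq = (np.asarray(new_project) - mu) / sigma  # ...so all have equal weight
        dist = np.sqrt(((z - zq) ** 2).sum(axis=1))  # Euclidean distance in n-space
        nearest = np.argsort(dist)[:k]               # k most similar projects
        return efforts[nearest].mean()               # neighbours' effort -> estimate

Standardising before computing distances is what gives each feature equal weight, as the abstract describes; without it, features measured on large scales would dominate the similarity calculation.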
Quantum Software Analytics: Opportunities and Challenges
Quantum computing systems depend on the principles of quantum mechanics to perform multiple challenging tasks more efficiently than their classical counterparts. In classical software engineering, the software life cycle is used to document and structure the processes of design, implementation, and maintenance of software applications. It helps stakeholders understand how to build an application. In this paper, we summarize a set of software analytics topics and techniques in the development life cycle that can be leveraged and integrated into quantum software application development. The results of this work can assist researchers and practitioners in better understanding the quantum-specific emerging development activities, challenges, and opportunities in the next generation of quantum software.
Software maintenance cost estimation with fourth generation languages
This thesis addresses the problem of allocating software maintenance resources in a commercial environment using fourth generation language systems. The activity of maintaining software has a poor image amongst software managers, as it often appears that there is no end product. This image will only improve when software maintenance can be discussed in business terms, not least because maintenance costs can then be compared to the costs of not maintaining the system. Software maintenance will continue to exist in the fourth generation environment, as systems will still be required to evolve. Cost estimation is an imprecise science, as there are many variables, such as human, technical, environmental and political factors, which can affect the ultimate costs of software and the resources required to maintain it. Some of the factors appear more obvious than others; for example, an experienced programmer can achieve a specific task in less time than an inexperienced one. To fully estimate software maintenance costs, these factors need to be identified and weights assigned to them. This thesis examines a means to identify these factors and their weights, and produces the first cut of an equation which will enable the software maintenance resources in a fourth generation language environment to be estimated.
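The abstract does not reproduce the equation itself, but the described approach, identified factors each carrying an assigned weight, implies a weighted combination of roughly the following shape. The factors named here are hypothetical placeholders, not the thesis's fitted values:

    Effort = w0 + w1 * (staff experience) + w2 * (system size) + ... + wn * (factor n)

where each wi is the weight assigned to the corresponding maintenance cost factor during the calibration the thesis describes.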
An empirical investigation into software effort estimation by analogy
Most practitioners recognise the important part accurate estimates of development effort play in the successful management of major software projects. However, current estimation techniques are often very inaccurate, and studies (Heemstra 1992; Lederer and Prasad 1993) have shown that effort estimation research is not being effectively transferred from the research domain into practical application. Traditionally, research has focused almost exclusively on the advancement of algorithmic models (e.g. COCOMO (Boehm 1981) and SLIM (Putnam 1978)), where effort is commonly expressed as a function of system size. In recent years, however, there has been a discernible movement away from algorithmic models, with non-algorithmic systems (often encompassing machine learning facets) being actively researched. This is potentially a very exciting and important time in this field, with new approaches regularly being proposed. One such technique, estimation by analogy, is the focus of this thesis. The principle behind estimation by analogy is that past experience can often provide insights and solutions to present problems. Software projects are characterised in terms of collectable features (such as the number of screens or the size of the functional requirements) and stored in a historical case base as they are completed. Once a case base of sufficient size has been cultivated, new projects can be estimated by finding similar historical projects and re-using the recorded effort. To make estimation by analogy feasible it was necessary to construct a software tool, dubbed ANGEL, which allowed the collection of historical project data and the generation of estimates for new software projects. A substantial empirical validation of the approach was made, encompassing approximately 250 real historical software projects across eight industrial data sets and using stepwise regression as a benchmark. Significance tests on the results accepted the hypothesis (at the 1% significance level) that estimation by analogy is a more accurate prediction system than stepwise regression. A study was also made of the sensitivity of the analogy approach. By growing project data sets in a pseudo time-series fashion it was possible to answer pertinent questions about the approach, such as what the effects of outlying projects are and what the minimum data set size is. The main conclusions of this work are that estimation by analogy is a viable estimation technique that would seem to offer some advantages over algorithmic approaches, including improved accuracy, easier use of categorical features and an ability to operate even where no statistical relationships can be found.
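The pseudo time-series sensitivity analysis mentioned above can be sketched as follows. This is an illustrative outline assuming a generic fit_predict(X_train, y_train, X_new) callable rather than the thesis's actual tooling: projects are added to the case base one at a time, and each incoming project is predicted only from those that preceded it.

    import numpy as np

    # Illustrative sketch of growing a data set in pseudo time-series fashion.
    def pseudo_time_series(X, y, fit_predict, min_size=5):
        # Predict each incoming project from the projects already completed;
        # the resulting error trajectory shows how accuracy responds to
        # case base size and to the arrival of outlying projects
        errors = []
        for i in range(min_size, len(y)):
            pred = fit_predict(X[:i], y[:i], X[i:i + 1])[0]
            errors.append((i, abs(y[i] - pred) / y[i]))
        return errors  # (case base size, magnitude of relative error) pairs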
Object-oriented software development effort prediction using design patterns from object interaction analysis
Software project management is arguably the most important activity in modern software development projects. In the absence of realistic and objective management, the software development process cannot be managed in an effective way. Software development effort estimation is one of the most challenging and researched problems in project management. With the advent of object-oriented development, there have been studies to transpose some of the existing effort estimation methodologies to the new development paradigm. However, no holistic approach to estimation exists that allows for the refinement of an initial estimate produced in the requirements gathering phase through to the design phase. A SysML point methodology is proposed that is based on a common, structured and comprehensive modeling language (OMG SysML) and that factors the models corresponding to the primary phases of object-oriented development into the production of an effort estimate. This dissertation presents a Function Point-like approach, named Pattern Point, which was conceived to estimate the size of object-oriented products using the design patterns found in object interaction modeling from the late OO analysis phase. In particular, two measures are proposed (PP1 and PP2) that are theoretically validated, showing that they satisfy well-known properties necessary for size measures.
An initial empirical validation is performed that is meant to assess the usefulness and effectiveness of the proposed measures in predicting the development effort of object-oriented systems. Moreover, a comparative analysis is carried out, taking into account several other size measures. The experimental results show that the Pattern Point measure can be effectively used during the OOA phase to predict the effort values with a high degree of confidence. The PP2 metric yielded the best results, with an aggregate PRED(0.25) = 0.874.
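For reference, PRED(l) is the proportion of projects whose magnitude of relative error is within l of the actual value, so the figure above means roughly 87% of PP2 estimates fell within 25% of actual effort. A minimal sketch of the computation, assuming NumPy arrays of actual and predicted efforts:

    import numpy as np

    # Illustrative computation of the PRED(l) accuracy measure.
    def pred(actual, predicted, level=0.25):
        # PRED(l): fraction of estimates whose magnitude of relative error,
        # |actual - predicted| / actual, does not exceed the threshold l
        mre = np.abs(actual - predicted) / actual
        return float(np.mean(mre <= level))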
Productivity prediction model based on Bayesian analysis and productivity console
Software project management is one of the most critical activities in modern software development projects. Without realistic and objective management, the software development process cannot be managed in an effective way. There are three general problems in project management: effort estimation is not accurate, actual status is difficult to understand, and projects are often geographically dispersed. Estimating software development effort is one of the most challenging problems in project management. Various attempts have been made to solve the problem; so far, however, it remains a complex problem, and the error rate of even a renowned effort estimation model can exceed 30% of the actual productivity. Inaccurate estimation therefore results in poor planning and undermines effective control of time and budgets in project management. In this research, we have built a productivity prediction model which uses productivity data from an ongoing project to re-evaluate the initial productivity estimate and provide managers with a better productivity estimate for project management. The actual status of a software project is not easy to understand due to problems inherent in software project attributes: the attributes are dispersed across various CASE (Computer-Aided Software Engineering) tools and are difficult to measure because, unlike building blocks, they are not tangible material. In this research, we have also created a productivity console which incorporates an expert system to measure project attributes objectively and provides graphical charts to visualize project status. The productivity console uses project attributes gathered in the KB (Knowledge Base) of PAMPA II (Project Attributes Monitoring and Prediction Associate), which works with CASE tools and collects project attributes from the databases of those tools. The productivity console and PAMPA II work on a network, so geographically dispersed projects can be managed via the Internet without difficulty.
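The re-evaluation step described above lends itself to a conjugate Bayesian update. The sketch below is a minimal illustration of that idea, assuming a normal prior on productivity and a known observation variance; it is not PAMPA II's actual model, whose details the abstract does not give.

    import numpy as np

    # Illustrative normal-normal conjugate update; not PAMPA II's model.
    def update_productivity(prior_mean, prior_var, observations, obs_var):
        # Combine the initial productivity estimate (the prior) with
        # productivity measurements from the ongoing project to produce
        # a re-evaluated estimate plus its remaining uncertainty
        n = len(observations)
        post_var = 1.0 / (1.0 / prior_var + n / obs_var)
        post_mean = post_var * (prior_mean / prior_var
                                + np.sum(observations) / obs_var)
        return post_mean, post_var

As more in-project observations accumulate, the posterior mean moves from the initial estimate toward the observed productivity and the posterior variance shrinks, which matches the abstract's goal of giving managers a progressively better estimate as the project runs.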