    Making inferences with small numbers of training sets

    A potential methodological problem with empirical studies that assess project effort prediction systems is discussed. Frequently, a hold-out strategy is deployed, so that the data set is split into a training and a validation set. Inferences are then made concerning the relative accuracy of the different prediction techniques under examination. This is typically done with very small numbers of sampled training sets. It is shown that such studies can lead to almost random results (particularly where relatively small effects are being studied). To illustrate this problem, two data sets are analysed using a configuration problem for case-based prediction, with results generated from 100 training sets. This enables results to be produced with quantified confidence limits. From this it is concluded that in both cases using fewer than five training sets leads to untrustworthy results, and that ideally more than 20 sets should be deployed. Unfortunately, this casts doubt on a number of empirical validations of prediction techniques, so it is suggested that further research is needed as a matter of urgency.
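
    The following Python sketch (with an invented effect size and noise level, not the paper's data or code) illustrates the core point: with only a handful of sampled training sets, even the sign of a small accuracy difference between two techniques is close to random.

        import numpy as np

        rng = np.random.default_rng(0)

        def mean_accuracy_gap(n_training_sets, true_effect=0.02, noise_sd=0.15):
            """Mean observed accuracy gap between two prediction techniques,
            averaged over n_training_sets random hold-out samples. Each
            sample yields a noisy estimate of the (small) true effect."""
            gaps = true_effect + rng.normal(0.0, noise_sd, size=n_training_sets)
            return gaps.mean()

        for n in (1, 5, 20, 100):
            estimates = np.array([mean_accuracy_gap(n) for _ in range(1000)])
            # Probability of concluding the *worse* technique is better.
            print(f"{n:>3} training sets: P(wrong sign) ~ {(estimates < 0).mean():.2f}")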

    Measuring the impact of computer resource quality on the software development process and product

    The availability and quality of computer resources during the software development process were speculated to have a measurable, significant impact on the efficiency of the development process and the quality of the resulting product. Environment components such as the types of tools, machine responsiveness, and quantity of direct-access storage may play a major role in the effort to produce the product and in its subsequent quality as measured by factors such as reliability and ease of maintenance. During the past six years, the NASA Goddard Space Flight Center has conducted experiments with software projects in an attempt to better understand the impact of software development methodologies, environments, and general technologies on the software process and product. Data were extracted and examined from nearly 50 software development projects, all related to support of satellite flight dynamics ground-based computations. The relationship between computer resources and the software development process and product, as exemplified by the subject NASA data, was examined. Based upon the results, a number of computer resource-related implications are provided.
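
    As a purely illustrative sketch (the NASA/SEL project data are not reproduced here), the kind of relationship the study examines can be expressed as a least-squares fit of development effort against resource measures such as machine responsiveness and direct-access storage; all numbers below are invented.

        import numpy as np

        # Hypothetical per-project measures: [mean response time (s),
        # direct-access storage (MB)] and the resulting effort (staff-months).
        X = np.array([[2.1, 40.0], [0.8, 120.0], [1.5, 80.0], [3.0, 30.0]])
        effort = np.array([55.0, 32.0, 41.0, 63.0])

        # Fit effort ~ b0 + b1 * response_time + b2 * storage.
        A = np.column_stack([np.ones(len(X)), X])
        coef, *_ = np.linalg.lstsq(A, effort, rcond=None)
        print("intercept, response-time and storage coefficients:", coef)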

    Calculation and use of an environment's characteristic software metric set

    Since both cost/quality goals and production environments differ, this study presents an approach for customizing a characteristic set of software metrics to an environment. The approach is applied in the Software Engineering Laboratory (SEL), a NASA Goddard production environment, to 49 candidate process and product metrics of 652 modules from six projects (51,000 to 112,000 lines each). For this particular environment, the method yielded the characteristic metric set (source lines, fault correction effort per executable statement, design effort, code effort, number of I/O parameters, number of versions). The uses examined for a characteristic metric set include forecasting the effort for development, modification, and fault correction of modules based on historical data.
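
    A minimal sketch of the underlying idea (not the SEL's actual selection procedure): group strongly correlated candidate metrics and keep one representative per group, so the environment is described by a small characteristic set. The metric names and module measurements below are hypothetical.

        import numpy as np

        def characteristic_set(data, names, threshold=0.8):
            """data: (modules x metrics) array. Keep the first metric of each
            group whose pairwise |correlation| exceeds threshold."""
            corr = np.corrcoef(data, rowvar=False)
            kept = []
            for j in range(len(names)):
                if all(abs(corr[j, k]) < threshold for k in kept):
                    kept.append(j)
            return [names[k] for k in kept]

        rng = np.random.default_rng(1)
        lines = rng.normal(500, 100, 652)
        data = np.column_stack([
            lines,
            lines * 0.9 + rng.normal(0, 10, 652),  # redundant with size
            rng.normal(30, 5, 652),                # design effort
            rng.normal(4, 1, 652),                 # I/O parameters
        ])
        print(characteristic_set(data, ["source lines", "code effort",
                                        "design effort", "I/O parameters"]))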

    Monitoring software development through dynamic variables

    Research conducted by the Software Engineering Laboratory (SEL) on the use of dynamic variables as a tool to monitor software development is described. Project-independent measures that may be used in a management tool for monitoring software development are identified. Several FORTRAN projects with similar profiles are examined: the staff was experienced in developing these types of projects, and the projects developed serve similar functions. Because these projects are similar, some underlying relationships exist that are invariant between projects. These relationships, once well defined, may be used to compare the development of different projects to determine whether they are evolving the same way previous projects in this environment evolved.
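
    The sketch below (with invented baseline curves) shows one way such an invariant relationship could serve as a monitoring tool: a current project's dynamic variable, here a normalized code growth curve, is checked against the envelope observed on similar completed projects.

        import numpy as np

        # Hypothetical normalized growth (fraction of final size) from five
        # completed, similar projects, sampled at ten checkpoints.
        baseline = np.array([
            [.05, .12, .22, .35, .48, .60, .72, .83, .93, 1.0],
            [.04, .10, .20, .33, .50, .63, .75, .85, .94, 1.0],
            [.06, .14, .25, .38, .52, .65, .76, .86, .95, 1.0],
            [.05, .11, .21, .34, .49, .62, .74, .84, .93, 1.0],
            [.07, .15, .26, .40, .53, .66, .77, .87, .95, 1.0],
        ])
        lo, hi = baseline.min(axis=0), baseline.max(axis=0)

        current = np.array([.03, .07, .12, .18, .26, .35, .45, .55, .66, .78])
        for i, value in enumerate(current):
            if not lo[i] <= value <= hi[i]:
                print(f"checkpoint {i + 1}: {value:.2f} outside "
                      f"[{lo[i]:.2f}, {hi[i]:.2f}] -- flag for management")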

    Can k-NN imputation improve the performance of C4.5 with small software project data sets? A comparative evaluation

    Missing data is a widespread problem that can affect the ability to use data to construct effective prediction systems. We investigate a common machine learning technique that can tolerate missing values, namely C4.5, to predict cost using six real-world software project databases. We analyze the predictive performance after using the k-NN missing data imputation technique to see whether it is better to tolerate missing data or to impute missing values and then apply the C4.5 algorithm. For the investigation, we simulated three missingness mechanisms, three missing data patterns, and five missing data percentages. We found that k-NN imputation can improve the prediction accuracy of C4.5. At the same time, both C4.5 and k-NN are little affected by the missingness mechanism, but the missing data pattern and the missing data percentage have a strong negative impact upon prediction (or imputation) accuracy, particularly if the missing data percentage exceeds 40%.
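
    A minimal sketch of this comparison, assuming scikit-learn and synthetic data: C4.5 itself is not available in scikit-learn, so a CART decision tree stands in for it, and C4.5's native tolerance of missing values is approximated crudely by median filling.

        import numpy as np
        from sklearn.impute import KNNImputer
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeRegressor

        rng = np.random.default_rng(42)
        X = rng.normal(size=(100, 5))
        y = X @ np.array([3.0, 1.0, 0.0, 2.0, 1.0]) + rng.normal(0, 0.5, 100)

        # Inject ~20% missing values completely at random (MCAR).
        X_missing = np.where(rng.random(X.shape) < 0.20, np.nan, X)

        # Strategy 1: k-NN imputation, then fit the tree.
        X_knn = KNNImputer(n_neighbors=5).fit_transform(X_missing)
        tree = DecisionTreeRegressor(random_state=0)
        print("k-NN imputation R^2:", cross_val_score(tree, X_knn, y, cv=5).mean())

        # Strategy 2: "tolerate" the gaps with a simple median fill (a crude
        # proxy for C4.5's built-in fractional-instance handling).
        X_med = np.where(np.isnan(X_missing),
                         np.nanmedian(X_missing, axis=0), X_missing)
        print("median fill R^2:   ", cross_val_score(tree, X_med, y, cv=5).mean())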

    Software development: A paradigm for the future

    A new paradigm for software development that treats software development as an experimental activity is presented. It provides built-in mechanisms for learning how to develop software better and for reusing previous experience in the forms of knowledge, processes, and products. It uses models and measures to aid in the tasks of characterization, evaluation, and motivation. An organizational scheme is proposed for separating the project-specific focus from the organization's learning and reuse focuses of software development. The implications of this approach for corporations, research, and education are discussed, and some research activities currently underway at the University of Maryland that support this approach are presented.

    PROPERTY DEVELOPMENT: AN EVALUATION OF DECISION SUPPORT SYSTEMS

    Although there are a number of development approaches proposed in the decision support systems (DSS) literature, there appears to be a preference for prototyping over structured approaches. This paper describes the analysis stage of a DSS for the Property Development Department (PDS) at the Palmerston North City Council, New Zealand. The PDS role involves many ill-structured decisions with a large number of stakeholders. The paper describes the selection process for the methodology, analysing the criteria for selection, and proposes a structured process for this analysis. The paper provides insights into when structured approaches are more appropriate for the development of DSS, based on the type and complexity of the decisions supported.

REBEE: Reusability-Based Effort Estimation Technique using Dynamic Neural Network

    Software effort estimation has been researched for over 25 years, but to date no truly effective model has been designed that can efficiently gauge the effort required for heterogeneous project data. Reusability factors of software development have been used to design a new effort estimation model called REBEE, which encompasses the use of fuzzy logic and dynamic neural networks. The experimental evaluation of the model demonstrates efficient effort estimation over varied project types.

    An Approach for Effort Estimation having Reusable Components in Software Development

    Estimation of the effort required for software development has been researched for over 25 years, yet there still exists no concrete solution for estimating development effort. Prior experience with similar types of projects is key for business today. This paper proposes an effort estimation model named REBEE, based on reusability metrics, to effectively estimate the effort involved in development. A project is assumed to consist of multiple modules, and the reusability factor of each module is considered in the technique described here. REBEE utilizes fuzzy logic and dynamic neural networks to achieve its goal. Based on the experimental evaluation discussed in this paper, it is evident that this model accurately predicts the effort involved for heterogeneous project types.
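
    The following is a purely illustrative sketch, not the authors' REBEE implementation: each module's reusability factor is fuzzified into low/medium/high memberships, which together with module size feed a small feed-forward network whose per-module estimates are summed. The weights here are untrained placeholders and all numbers are invented.

        import numpy as np

        def fuzzify(r):
            """Triangular low/medium/high memberships for reusability r in [0, 1]."""
            low = max(0.0, 1.0 - 2.0 * r)
            medium = max(0.0, 1.0 - 2.0 * abs(r - 0.5))
            high = max(0.0, 2.0 * r - 1.0)
            return np.array([low, medium, high])

        def module_effort(size_kloc, reusability, W1, b1, w2, b2):
            """One tanh hidden layer mapping (size, fuzzy reusability) to effort."""
            x = np.concatenate([[size_kloc], fuzzify(reusability)])
            h = np.tanh(W1 @ x + b1)
            return float(w2 @ h + b2)

        rng = np.random.default_rng(7)
        W1, b1 = rng.normal(0, 0.3, (4, 4)), np.zeros(4)
        w2, b2 = rng.normal(0, 0.3, 4), 5.0  # placeholder, untrained weights

        modules = [(12.0, 0.8), (30.0, 0.1), (8.0, 0.5)]  # (KLOC, reusability)
        total = sum(module_effort(s, r, W1, b1, w2, b2) for s, r in modules)
        print(f"estimated project effort: {total:.1f} person-months")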

    A Theory Of Small Program Complexity

    Small programs are those which are written and understood by one person. Large software systems usually consist of many small programs. The complexity of a small program is a prediction of how difficult it would be for someone to understand the program. This complexity depends on three factors: (1) the size and interrelationships of the program itself; (2) the size and interrelationships of the internal model of the program's purpose held by the person trying to understand the program; and (3) the complexity of the mapping between the model and the program. A theory of small program complexity based on these three factors is presented. The theory leads to several testable predictions. Experiments are described which test these predictions and whose results could verify or destroy the theory. © 1982, ACM. All rights reserved.
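
    To make the three-factor structure concrete, here is a toy formalization (not the paper's actual model): each factor is given a unit-free score and the scores are combined, so that two functionally identical programs can differ in complexity purely through the cost of mapping the reader's model onto the code.

        def small_program_complexity(program_size, program_links,
                                     model_size, model_links, mapping_cost):
            """Toy score. Factor 1: the program's size and interrelationships.
            Factor 2: the reader's internal model of the program's purpose.
            Factor 3: the difficulty of mapping that model onto the program."""
            return (program_size + program_links) + (model_size + model_links) + mapping_cost

        # A clear sort routine vs. an obfuscated equivalent: only the
        # model-to-program mapping cost differs.
        print(small_program_complexity(20, 5, 10, 3, 2))   # clear version
        print(small_program_complexity(20, 5, 10, 3, 15))  # obfuscated version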