
    Predicting and Evaluating Software Model Growth in the Automotive Industry

    The size of a software artifact influences software quality and impacts the development process. In industry, when software size exceeds certain thresholds, memory errors accumulate and development tools may no longer cope, resulting in lengthy program start-up times, failing builds, or memory problems at unpredictable times. Foreseeing critical growth in software modules is therefore in high demand in industrial practice. Predicting when the size will grow to the level where maintenance is needed prevents unexpected effort and helps to spot problematic artifacts before they become critical. Although the number of prediction approaches in the literature is vast, it is unclear how well they fit the prerequisites and expectations of practice. In this paper, we perform an industrial case study at an automotive manufacturer to explore the applicability and usability of prediction approaches in practice. As a first step, we collect the most relevant prediction approaches from the literature, including both statistical and machine learning approaches. Furthermore, we elicit practitioners' expectations towards predictions using a survey and stakeholder workshops. At the same time, we measure the software size of 48 software artifacts by mining four years of revision history, resulting in 4,547 data points. In the last step, we assess the applicability of state-of-the-art prediction approaches using the collected data by systematically analyzing how well they fulfill the practitioners' expectations. Our main contribution is a comparison of commonly used prediction approaches in a real-world industrial setting that takes stakeholder expectations into account. We show that the approaches provide significantly different results regarding prediction accuracy and that the statistical approaches fit our data best.
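
As a rough illustration of the statistical route the study favours, one can fit a linear trend to a module's size history and extrapolate when it will cross a maintenance threshold. The sketch below uses hypothetical weekly size measurements and a hypothetical threshold of 10,000 elements; it is not the paper's actual model.

```python
# Minimal sketch of threshold-crossing prediction with a linear trend.
# The data points and the 10,000-element threshold are hypothetical.

def fit_linear(xs, ys):
    """Ordinary least squares fit y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

def weeks_until(threshold, xs, ys):
    """Predict when the fitted size trend crosses the threshold."""
    a, b = fit_linear(xs, ys)
    return (threshold - b) / a  # x value at which a*x + b == threshold

weeks = list(range(8))  # revision-history sampling points
sizes = [1200, 1350, 1480, 1610, 1770, 1900, 2050, 2180]  # model size per week
print(round(weeks_until(10000, weeks, sizes), 1))
```

In practice one would compare such a fit against the machine learning alternatives the study collects, since the paper's point is precisely that the approaches differ significantly in accuracy.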

    Experience: Quality benchmarking of datasets used in software effort estimation

    Data is a cornerstone of empirical software engineering (ESE) research and practice. Data underpins numerous process and project management activities, including the estimation of development effort and the prediction of the likely location and severity of defects in code. Serious questions have been raised, however, over the quality of the data used in ESE. Data quality problems caused by noise, outliers, and incompleteness have been noted as being especially prevalent. Other quality issues, although also potentially important, have received less attention. In this study, we assess the quality of 13 datasets that have been used extensively in research on software effort estimation. The quality issues considered in this article draw on a taxonomy that we published previously based on a systematic mapping of data quality issues in ESE. Our contributions are as follows: (1) an evaluation of the “fitness for purpose” of these commonly used datasets and (2) an assessment of the utility of the taxonomy in terms of dataset benchmarking. We also propose a template that could be used both to improve the ESE data collection/submission process and to evaluate other such datasets, contributing to enhanced awareness of data quality issues in the ESE community and, in time, the availability and use of higher-quality datasets.
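
Two of the quality problems the abstract names, incompleteness and outliers, can be screened for mechanically. The sketch below checks a hypothetical effort dataset for missing values and flags outliers with Tukey's IQR rule; the data, field name, and thresholds are illustrative and not drawn from the benchmarked datasets.

```python
# Minimal sketch of two data-quality screens: incompleteness
# (missing-value rate) and outliers (Tukey's 1.5*IQR rule).
# The tiny effort dataset below is hypothetical.

def missing_rate(rows, field):
    """Fraction of records with no value for the given field."""
    missing = sum(1 for r in rows if r.get(field) is None)
    return missing / len(rows)

def iqr_outliers(values):
    """Values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    s = sorted(values)
    def quartile(q):
        pos = q * (len(s) - 1)
        lo, hi = int(pos), min(int(pos) + 1, len(s) - 1)
        return s[lo] + (s[hi] - s[lo]) * (pos - lo)
    q1, q3 = quartile(0.25), quartile(0.75)
    lo, hi = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    return [v for v in values if v < lo or v > hi]

projects = [
    {"effort": 120}, {"effort": 95}, {"effort": 110},
    {"effort": None}, {"effort": 900}, {"effort": 105},
]
print(missing_rate(projects, "effort"))
known = [p["effort"] for p in projects if p["effort"] is not None]
print(iqr_outliers(known))  # flags the 900 person-hour entry
```

Noise (mislabeled or corrupted values), the third issue the abstract highlights, cannot be detected this mechanically and is one reason a taxonomy-driven assessment is needed.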

    Integrate the GM(1,1) and Verhulst models to predict software stage effort

    This is the author's accepted manuscript. The final published article is available from the link below. Copyright @ 2009 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.

    Software effort prediction clearly plays a crucial role in software project management. In keeping with more dynamic approaches to software development, it is not sufficient to predict only the whole-project effort at an early stage. Rather, the project manager must also dynamically predict the effort of different stages or activities during the software development process. This can help the project manager to re-estimate effort and adjust the project plan, thus avoiding effort or schedule overruns. This paper presents a method for software physical-time stage-effort prediction based on the grey models GM(1,1) and Verhulst. This method establishes models dynamically according to particular types of stage-effort sequences, and can adapt to particular development methodologies automatically by using a novel grey feedback mechanism. We evaluate the proposed method on a large-scale real-world software engineering dataset and compare it with the linear regression method and the Kalman filter method, revealing that accuracy has been improved by at least 28% and 50%, respectively. The results indicate that the method can be effective and has considerable potential. We believe that stage predictions could be a useful complement to whole-project effort prediction methods.

    This work was supported by the National Natural Science Foundation of China and the Hi-Tech Research and Development Program of China.
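
For readers unfamiliar with grey models, the sketch below implements the plain GM(1,1) fitting and forecasting steps on a hypothetical stage-effort sequence. It covers only the textbook GM(1,1); the Verhulst variant and the paper's grey feedback mechanism are not reproduced here.

```python
# Minimal sketch of GM(1,1): accumulate the series, fit the grey
# differential equation by least squares, then forecast and de-accumulate.
# The stage-effort sequence is hypothetical.
import math

def gm11_forecast(x0, steps=1):
    """Fit GM(1,1) to sequence x0 and forecast the next `steps` values."""
    n = len(x0)
    x1 = [sum(x0[:i + 1]) for i in range(n)]                # accumulated series
    z1 = [0.5 * (x1[i] + x1[i - 1]) for i in range(1, n)]   # background values
    # Least squares for a, b in x0(k) + a*z1(k) = b, k = 2..n
    y = x0[1:]
    m = n - 1
    sz, szz = sum(z1), sum(z * z for z in z1)
    sy, szy = sum(y), sum(z * v for z, v in zip(z1, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det

    def x1_hat(k):  # k is a 0-based index into the accumulated series
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # De-accumulate: forecast value = difference of consecutive x1_hat terms
    return [x1_hat(n + s) - x1_hat(n + s - 1) for s in range(steps)]

stage_effort = [50.0, 62.0, 74.0, 91.0, 108.0]  # effort per stage (person-days)
print([round(v, 1) for v in gm11_forecast(stage_effort, steps=2)])
```

GM(1,1) needs only a handful of recent data points, which is what makes this family of models attractive for dynamic, stage-by-stage re-estimation.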

    Insights on Research Techniques towards Cost Estimation in Software Design

    Software cost estimation is one of the most challenging tasks in project management, needed to ensure smooth development operations and target achievement. Various standard tools and techniques for cost estimation have evolved and are practiced in the industry at present. However, the overall effectiveness of such techniques has never been investigated to date. This paper begins its contribution by presenting a taxonomy of conventional cost-estimation techniques and then investigates the research trends around the most frequently addressed problems. The paper also reviews the existing techniques in a well-structured manner in order to highlight the problems addressed, the techniques used, the associated advantages, and the limitations explored in the literature. Finally, we briefly discuss the open research issues identified, as an added contribution of this manuscript.

    Strategic and Operational Management of Supplier Involvement in New Product Development: a Contingency Perspective

    This paper examines how firms succeed in leveraging supplier involvement in product development. It extends earlier work on managing supplier involvement by providing an integrated analysis of results, processes, and conditions, both at the level of individual development projects and at the level of the overall firm. Following a multiple-case study approach with theoretical sampling, the study examines eight projects in which four manufacturers from different industries involve multiple suppliers. The findings suggest that successful supplier involvement depends on the coordinated design, execution, and evaluation of strategic, long-term processes and operational, short-term management processes, and on the presence of enabling factors such as a cross-functionally oriented organization. The required intensity of these processes and enablers depends on contingencies such as firm size and environmental uncertainty. In contrast with previous research, we find no indications that managing supplier involvement requires a different approach in highly innovative projects compared to less innovative projects.

    Keywords: innovation; new product development; purchasing; supplier relations; R&D management

    Opinion mining with the SentiWordNet lexical resource

    Sentiment classification concerns the application of automatic methods for predicting the orientation of sentiment in text documents. It is an important subject in opinion mining research, with applications in a number of areas including recommender and advertising systems, customer intelligence, and information retrieval. SentiWordNet is a lexical resource of sentiment information for terms in the English language designed to assist in opinion mining tasks, where each term is associated with numerical scores for positive and negative sentiment. A resource that makes term-level sentiment information readily available could be of use in building more effective sentiment classification methods. This research presents the results of an experiment that applied the SentiWordNet lexical resource to the problem of automatic sentiment classification of film reviews. First, a data set of relevant features extracted from text documents using SentiWordNet was designed and implemented. The resulting feature set was then used as input for training a support vector machine classifier to predict the sentiment orientation of the underlying film review. Several scenarios exploring variations in the parameters that generate the data set, outlier removal, and feature selection were executed. The results obtained were compared to other methods documented in the literature and found to be in line with other experiments that propose similar approaches and use the same data set of film reviews, indicating that SentiWordNet could become an important resource for the task of sentiment classification. Considerations on future improvements are also presented, based on a detailed analysis of the classification results.
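
The term-scoring step can be illustrated without the full resource: the sketch below aggregates per-term positive and negative scores into document-level features, using a made-up three-word lexicon as a stand-in for SentiWordNet's real scores. The paper then trains an SVM on features of this kind; the scores and the review text here are purely illustrative.

```python
# Minimal sketch of lexicon-based feature extraction: sum each term's
# positive and negative scores over a document. The three-entry lexicon
# stands in for SentiWordNet; its scores are made up.

LEXICON = {  # term -> (positive score, negative score), hypothetical values
    "brilliant": (0.875, 0.0),
    "dull": (0.0, 0.75),
    "predictable": (0.125, 0.5),
}

def sentiment_features(tokens):
    """Summed positive/negative scores; terms not in the lexicon score zero."""
    pos = sum(LEXICON.get(t, (0.0, 0.0))[0] for t in tokens)
    neg = sum(LEXICON.get(t, (0.0, 0.0))[1] for t in tokens)
    return pos, neg

review = "a brilliant cast stuck in a dull and predictable plot".split()
print(sentiment_features(review))  # (1.0, 1.25) -> leans negative
```

In the real pipeline these raw sums would typically be normalized by document length before being fed to the classifier.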

    Evaluation bias in effort estimation

    A large number of software effort estimation methods exist in the literature, and the space of possibilities [54] is yet to be fully explored. There is little conclusive evidence about the relative performance of such methods, and many studies suffer from instability in their conclusions. As a result, the effort estimation literature lacks a stable ranking of such methods.

    This research aims at providing a stable ranking of a large number of methods using data sets based on COCOMO features. For this task, the COSEEKMO tool [46] was further developed into a benchmarking tool, and several well-known effort estimation methods, including model trees, linear regression methods, local calibration, and several newly developed methods, were compared thoroughly in COSEEKMO. The problem of instability was further explored, and the evaluation method used was identified as the cause of instability. The existing evaluation bias was therefore corrected through a new, non-parametric evaluation approach. The Mann-Whitney U test [42] is the non-parametric test used in this study, and it introduced a great amount of stability into the results. Several evaluation criteria were tested in order to analyze their possible effects on the observed stability.

    The conclusions made in this study were stable across different evaluation criteria, different data sets, and different random runs. As a result, a group of four methods was selected as the best effort estimation methods among the 312 explored combinations of methods. These four methods were all based on the local calibration procedure proposed by Boehm [4]. Furthermore, these methods were simpler and more effective than many other complex methods, including the Wrapper [37] and model trees [60], which are well-known methods in the literature.

    Therefore, while there exists no single universal best method for effort estimation, this study suggests applying the four methods reported here to the historical data and using the best-performing method among these four to estimate the effort for future projects. In addition, this study provides a path for comparing other existing or new effort estimation methods with the currently explored methods: a systematic comparison of the performance of each method against all other methods, including the methods studied in this work, through a benchmarking tool such as COSEEKMO, using the non-parametric Mann-Whitney U test.
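
The ranking machinery rests on the Mann-Whitney U test. A minimal pure-Python version of the U statistic, applied to two hypothetical samples of estimation errors, might look like the sketch below; a full significance test would add the normal approximation (or exact tables) on top of this statistic.

```python
# Minimal sketch of the Mann-Whitney U statistic over two methods'
# absolute estimation errors. The error samples are hypothetical.

def mann_whitney_u(xs, ys):
    """U statistic for sample xs versus ys, using average ranks for ties."""
    combined = sorted((v, i) for i, v in enumerate(xs + ys))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg = (i + 1 + j) / 2.0           # average rank of the tied block
        for k in range(i, j):
            ranks[combined[k][1]] = avg
        i = j
    r1 = sum(ranks[:len(xs)])             # rank sum of the first sample
    return r1 - len(xs) * (len(xs) + 1) / 2.0

errors_a = [0.10, 0.12, 0.15, 0.20, 0.22]   # method A relative errors
errors_b = [0.25, 0.30, 0.31, 0.35, 0.40]   # method B relative errors
print(mann_whitney_u(errors_a, errors_b))   # 0.0: every A error beats every B
```

Because the test compares ranks rather than raw error magnitudes, it is insensitive to the skewed, outlier-heavy error distributions that destabilize parametric comparisons, which is what makes it a natural fit for the stability problem described above.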