3 research outputs found

    A Conceptual MAP of Software Process Improvement

    Software organisations have for many years struggled to mature their engineering practices using a variety of approaches. Over the last decade a new approach, known as software process improvement (SPI), has emerged and become widely used in the software industry. In this paper we position SPI in the landscape of initiatives that software organisations can use to mature their operations. A map is offered describing the characteristic features of SPI initiatives, the benefits and risks related to such initiatives, and their relations to complementary approaches to professionalising the industry. The map highlights management, approach, and perspective as three main concerns of SPI and lists three key ideas for each of these concerns. The map is based on an extensive survey of the SPI literature combined with experiences from SPI practice. Practitioners can use the map strategically to decide whether to initiate SPI initiatives, to integrate SPI efforts with other improvement initiatives, and, more generally, to create and manage improvement programs based on SPI ideas. Researchers can use the map to identify key questions and areas of knowledge that can fruitfully inform SPI theory and practice.

    Data cleaning techniques for software engineering data sets

    Data quality is an important issue that has been recognised and addressed in research communities such as data warehousing, data mining and information systems. It is widely agreed that poor data quality degrades the quality of analysis results and therefore the decisions made on the basis of those results. Empirical software engineering has to some extent neglected the issue of data quality, which raises the question of how researchers in the field can trust their results without addressing the quality of the analysed data. One widely accepted definition of data quality describes it as 'fitness for purpose'; poor data quality can be addressed either by introducing preventative measures or by applying means to cope with data quality problems once they exist. The research presented in this thesis addresses the latter, with a special focus on noise handling.

    Three noise handling techniques, all based on decision trees, are proposed for application to software engineering data sets. Each represents a distinct approach: robust filtering, where training and test sets are the same; predictive filtering, where training and test sets are different; and filtering and polish, where noisy instances are corrected rather than discarded.

    The techniques were first evaluated in two investigations that applied them to a large real-world software engineering data set. The first investigation tested their ability to improve predictive accuracy at differing noise levels. All three techniques improved predictive accuracy in comparison with the do-nothing approach, with filtering and polish the most successful. The second investigation, using the same data set, tested the techniques' ability to identify instances with implausible values; these instances were flagged for evaluation purposes before the techniques were applied. Robust filtering and predictive filtering decreased the number of instances with implausible values but also substantially reduced the size of the data set. The filtering and polish technique actually increased the number of implausible values, although it did not reduce the size of the data set.

    Because the data set contained historical software project data, the real extent of the detected noise could not be known. This motivated the production of simulated software engineering data sets, modelled on the real data set used in the previous evaluations so as to preserve domain-specific characteristics. The simulated data sets were then injected with noise, so that its true extent was known, after which the three noise handling techniques were applied and evaluated. Simulating the data sets in this way combines domain-specific characteristics of the real world with full control over the data, which is seen as a particular strength of this evaluation approach. The results showed that none of the techniques performed well: robust filtering and filtering and polish performed very poorly and, on this evidence, would not be recommended for noise reduction, while predictive filtering was the best-performing technique but still did not perform convincingly.
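    As a concrete illustration of the three approaches described above, the following is a minimal sketch only, assuming scikit-learn decision trees, NumPy arrays X (features) and y (a discrete target), and arbitrary fold counts and tree settings; it is a plausible reading of the three ideas, not the thesis's exact procedures.

    ```python
    # Illustrative sketch of decision-tree-based noise handling, not the
    # thesis's exact procedures. Assumes X and y are NumPy arrays holding
    # a tabular software engineering data set with a discrete target.
    import numpy as np
    from sklearn.model_selection import KFold
    from sklearn.tree import DecisionTreeClassifier

    def robust_filtering(X, y, random_state=0):
        """Flag instances misclassified by a constrained tree trained on the
        whole data set (training and test sets are the same). The tree must
        be pruned/constrained, or it simply memorises every instance."""
        tree = DecisionTreeClassifier(min_samples_leaf=5, random_state=random_state)
        tree.fit(X, y)
        return tree.predict(X) != y

    def predictive_filtering(X, y, n_splits=10, random_state=0):
        """Flag instances whose target a tree trained on *other* folds fails
        to predict (training and test sets are different)."""
        noisy = np.zeros(len(y), dtype=bool)
        predicted = np.empty_like(y)
        kfold = KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
        for train_idx, test_idx in kfold.split(X):
            tree = DecisionTreeClassifier(random_state=random_state)
            tree.fit(X[train_idx], y[train_idx])
            pred = tree.predict(X[test_idx])
            noisy[test_idx] = pred != y[test_idx]
            predicted[test_idx] = pred
        return noisy, predicted

    def filter_out(X, y, noisy):
        """Robust/predictive filtering: drop the flagged instances."""
        return X[~noisy], y[~noisy]

    def filter_and_polish(y, noisy, predicted):
        """Filtering and polish: keep all instances but replace the suspect
        target values with the tree's predictions."""
        polished = y.copy()
        polished[noisy] = predicted[noisy]
        return polished
    ```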
An exhaustive systematic literature review was also carried out to investigate to what extent the empirical software engineering community has considered data quality. The findings showed that the issue has been largely neglected, and the work in this thesis therefore highlights an important gap in empirical software engineering. The thesis also clarifies and distinguishes the terms noise and outliers: the two overlap but are fundamentally different, and because noise handling techniques often treat them as the same, a clarification was necessary. A single investigation was deemed insufficient to assess the capabilities of noise handling techniques, both because the distinction between noise and outliers is not trivial and because the investigated noise cleaning techniques derive from traditional noise handling approaches in which noise and outliers are combined. Three investigations were therefore undertaken to assess the effectiveness of the three proposed techniques, each forming part of a multi-pronged approach. The thesis also highlights possible shortcomings of current automated noise handling techniques: the poor performance of the three techniques leads to the conclusion that noise handling should be integrated into a data cleaning process in which the input of domain knowledge and the replicability of the cleaning process are ensured.
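In the same spirit, the simulation-based evaluation described in the abstract could be approximated by injecting class noise at known positions and scoring how well a technique's flagged instances match them. The noise model below (random label swaps at a fixed rate) and the function names are assumptions made for illustration only.

    ```python
    # Illustrative evaluation harness: corrupt a known fraction of target
    # values, then measure how well a noise handling technique recovers
    # the corrupted instances. Not the thesis's simulation procedure.
    import numpy as np

    def inject_class_noise(y, rate=0.1, rng=None):
        """Corrupt a known fraction of target values in a copy of y and
        return both the noisy targets and the ground-truth noise mask."""
        if rng is None:
            rng = np.random.default_rng(0)
        y_noisy = y.copy()
        mask = rng.random(len(y)) < rate
        classes = np.unique(y)
        for i in np.where(mask)[0]:
            # Replace the true value with a different, randomly chosen class.
            y_noisy[i] = rng.choice(classes[classes != y[i]])
        return y_noisy, mask

    def detection_scores(flagged, truth):
        """Precision and recall of the flagged instances against the
        instances that were actually corrupted."""
        tp = np.sum(flagged & truth)
        precision = tp / max(flagged.sum(), 1)
        recall = tp / max(truth.sum(), 1)
        return precision, recall
    ```

    With the predictive_filtering sketch above, the flags it returns for the corrupted targets could then be scored against the ground-truth mask.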

    Modelo de calidad para el software orientado a objetos (A quality model for object-oriented software)

    Software has nowadays acquired great importance in every area of everyday life. Software quality undoubtedly plays a fundamental role in any software development effort, although it is sometimes not given sufficient attention, perhaps because of the relatively few works on the subject published to date. This work argues the need for a complete quality model. To meet this need, a new quality model is presented, obtained after a detailed study of existing quality models and focused on the object-oriented paradigm. The model shows how software quality is decomposed into a set of factors, which are in turn decomposed into criteria that can be assessed using measures. The model includes a broad set of measures designed specifically for application within the object-oriented paradigm. To complete the model, a simple application method has been designed so that the quality model can be used straightforwardly by developers of object-oriented systems. The model has been validated through a set of experiments in which it was applied to a series of object-oriented developments. The results demonstrated its practical usefulness both for determining the overall quality of the systems and for identifying the parts of a system most in need of improvement. This work fills an important gap in the area: first, no complete quality models exist for object orientation; second, although object-oriented measures exist, they have not been associated with the attributes that determine software quality, so their usefulness, as originally defined, is rather questionable; finally, a quality model has never before been paired with an application method, which considerably reduced its usefulness and left it dependent on the skill and experience of the software engineer applying it.
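    To make the factor, criterion and measure decomposition concrete, here is a minimal aggregation sketch in Python; the factor, criterion and measure names and the weights are invented for illustration and are not the model defined in the thesis.

    ```python
    # Illustrative sketch of a hierarchical quality model for OO code.
    # The factors, criteria, measures and weights are hypothetical examples,
    # not the model defined in the thesis.

    # Raw measures collected for one class, normalised to [0, 1] (1 = best),
    # e.g. derived from metrics such as depth of inheritance tree, number
    # of methods, coupling, or comment density.
    measures = {
        "inheritance_depth_score": 0.8,
        "method_count_score": 0.6,
        "comment_density_score": 0.9,
        "coupling_score": 0.7,
    }

    # Each criterion is a weighted combination of measures.
    criteria = {
        "understandability": {"comment_density_score": 0.6, "method_count_score": 0.4},
        "modularity": {"coupling_score": 0.7, "inheritance_depth_score": 0.3},
    }

    # Each factor is a weighted combination of criteria.
    factors = {
        "maintainability": {"understandability": 0.5, "modularity": 0.5},
    }

    def weighted_score(weights: dict, values: dict) -> float:
        """Weighted average of the referenced values."""
        return sum(w * values[name] for name, w in weights.items()) / sum(weights.values())

    criterion_scores = {c: weighted_score(w, measures) for c, w in criteria.items()}
    factor_scores = {f: weighted_score(w, criterion_scores) for f, w in factors.items()}

    print(criterion_scores)  # per-criterion quality estimates for the class
    print(factor_scores)     # per-factor quality estimates, e.g. maintainability
    ```

    An application method, as the abstract describes, would additionally prescribe how to collect the measures and how to interpret the aggregated scores; the sketch only shows the aggregation step.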