3 research outputs found

    Quality of Design, Analysis and Reporting of Software Engineering Experiments: A Systematic Review

    Background: Like any research discipline, software engineering research must be of a certain quality to be valuable. High quality research in software engineering ensures that knowledge is accumulated and helpful advice is given to the industry. One way of assessing research quality is to conduct systematic reviews of the published research literature. Objective: The purpose of this work was to assess the quality of published experiments in software engineering with respect to the validity of inference and the quality of reporting. More specifically, the aim was to investigate the level of statistical power, the analysis of effect size, the handling of selection bias in quasi-experiments, and the completeness and consistency of the reporting of information regarding subjects, experimental settings, design, analysis, and validity. Furthermore, the work aimed at providing suggestions for improvements, using the potential deficiencies detected as a basis. Method: The quality was assessed by conducting a systematic review of the 113 experiments published in nine major software engineering journals and three conference proceedings in the decade 1993-2002. Results: The review revealed that software engineering experiments were generally designed with unacceptably low power and that inadequate attention was paid to issues of statistical power. Effect sizes were sparsely reported and not interpreted with respect to their practical importance for the particular context. There seemed to be little awareness of the importance of controlling for selection bias in quasi-experiments. Moreover, the review revealed a need for more complete and standardized reporting of information, which is crucial for understanding software engineering experiments and judging their results. Implications: The consequence of low power is that the actual effects of software engineering technologies will not be detected to an acceptable extent. 
The lack of reporting of effect sizes and the improper interpretation of effect sizes result in ignorance of the practical importance, and thereby the relevance to industry, of experimental results. The lack of control for selection bias in quasi-experiments may make these experiments less credible than randomized experiments. This is an unsatisfactory situation, because quasi-experiments serve an important role in investigating cause-effect relationships in software engineering, for example, in industrial settings. Finally, the incomplete and unstandardized reporting makes it difficult for the reader to understand an experiment and judge its results. Conclusions: Insufficient quality was revealed in the reviewed experiments. This has implications for inferences drawn from the experiments and might in turn lead to the accumulation of erroneous information and the offering of misleading advice to the industry. Ways to improve this situation are suggested.

    A Usability Model for Software Development Processes and Practices

    Usability characterizes good interactions between people and their processes and practices. It promotes satisfaction and creates safe environments for innovation. Usability principles such as feedback and error tolerance are present in many software engineering concepts, such as iterative processes and peer reviews. The purpose of the research carried out for this thesis is to bring the concept of usability of practices and processes to software engineering. To achieve this goal, and given the lack of process quality models focused on usability, a Usability Model for Practices and Processes (UMP) was created, refined, and evaluated following the Design Science Research framework. UMP has been successfully applied to Scrum, Test Driven Development (TDD), Continuous Integration, Behaviour Driven Development (BDD), and the Visual Milestone Planning (VMP) method. UMP was designed to help practitioners, coaches, consultants, teachers, and researchers. Several empirical studies were conducted to evaluate UMP: an initial expert evaluation to determine its feasibility; a focus group to obtain feedback on UMP's characteristics and metrics; two reliability studies, namely an inter-rater agreement study on Scrum and an inter-rater reliability study on TDD-BDD; and two studies to evaluate the usefulness of UMP, namely a case study on the application of UMP to the VMP method and a field quasi-experiment in which an industry development team applied UMP to improve its BDD practice. The results of the usefulness studies show that users consider UMP useful, and 37 independent expert evaluations were carried out on real-world processes and practices.
The contributions of this thesis include: UMP with its characteristics and metrics; the UMP evaluation process; the knowledge created about the reliability and usefulness of UMP through the empirical studies; and the profiles that characterize the usability of practices and processes widely used in industry today, such as Scrum, Continuous Integration, TDD, and BDD, obtained by applying UMP. Scientific advisor: Alejandro Oliveros. Facultad de Informátic