
    An Ontology Model to Support the Automated Evaluation of Software

    Even though previous research has tried to model Software Engineering knowledge, focusing either on the entire discipline or on parts of it, we still lack an integrated conceptual model for representing software evaluations and the related information that supports their definition and enables their automation and reproducibility. This paper presents an extensible ontology model for representing software evaluations and evaluation campaigns, i.e., worldwide activities in which a group of tools is evaluated according to a certain evaluation specification using common test data. During the development of the ontologies, we have reused current standards and models and have linked these ontologies with some renowned ones.
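
    As a rough illustration of what representing an evaluation campaign in an ontology can look like, the sketch below populates and queries a small RDF graph with rdflib; the namespace and all class and property names (EvaluationCampaign, Tool, evaluatesTool, and so on) are assumptions for the example, not the paper's actual vocabulary.

# Hypothetical sketch: an evaluation campaign as RDF triples.
# The namespace and all class/property names are invented for illustration.
from rdflib import Graph, Namespace, RDF

EVAL = Namespace("http://example.org/software-evaluation#")

g = Graph()
g.bind("eval", EVAL)

campaign = EVAL["campaign-2024"]
g.add((campaign, RDF.type, EVAL.EvaluationCampaign))
g.add((campaign, EVAL.usesTestData, EVAL["common-benchmark-v1"]))
g.add((campaign, EVAL.followsSpecification, EVAL["spec-interoperability"]))

for tool_name in ("ToolA", "ToolB"):
    tool = EVAL[tool_name]
    g.add((tool, RDF.type, EVAL.Tool))
    g.add((campaign, EVAL.evaluatesTool, tool))

# List every tool evaluated in the campaign.
for row in g.query(
    "SELECT ?tool WHERE { ?c eval:evaluatesTool ?tool }",
    initNs={"eval": EVAL},
):
    print(row.tool)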

    SEON: a pyramid of ontologies for software evolution and its applications

    The Semantic Web provides a standardized, well-established framework to define and work with ontologies. It is especially apt for machine processing. However, researchers in the field of software evolution have not really taken advantage of that so far. In this paper, we address the potential of representing software evolution knowledge with ontologies and Semantic Web technology, such as Linked Data and automated reasoning. We present Seon, a pyramid of ontologies for software evolution, which describes stakeholders, their activities, artifacts they create, and the relations among all of them. We show the use of evolution-specific ontologies for establishing a shared taxonomy of software analysis services, for defining extensible meta-models, for explicitly describing relationships among artifacts, and for linking data such as code structures, issues (change requests), bugs, and basically any changes made to a system over time. For validation, we discuss three different approaches, which are backed by Seon and enable semantically enriched software evolution analysis. These techniques have been fully implemented as tools and cover software analysis with web services, a natural language query interface for developers, and large-scale software visualization.
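
    To give a flavor of the kind of Linked Data such ontologies enable, the sketch below puts a code entity, an issue, and a change into one RDF graph and asks which issues led to changes in a given class; the URIs and property names are placeholders invented for the example, not Seon's actual terms.

# Hypothetical sketch of linking evolution data (code, issues, changes) as RDF.
# URIs and property names are placeholders, not Seon's real vocabulary.
from rdflib import Graph, Namespace, RDF

SE = Namespace("http://example.org/evolution#")
g = Graph()
g.bind("se", SE)

cls, issue, change = SE.OrderService, SE.issue1234, SE.change5678
g.add((cls, RDF.type, SE.ClassEntity))
g.add((issue, RDF.type, SE.Issue))
g.add((change, RDF.type, SE.Change))
g.add((change, SE.resolvesIssue, issue))
g.add((change, SE.modifiesEntity, cls))

# "Which issues led to changes that touched OrderService?"
for row in g.query(
    """SELECT ?issue WHERE {
           ?change se:resolvesIssue ?issue ;
                   se:modifiesEntity se:OrderService .
       }""",
    initNs={"se": SE},
):
    print(row.issue)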

    Mining and linking crowd-based software engineering how-to screencasts

    In recent years, crowd-based content in the form of screencast videos has gained in popularity among software engineers. Screencasts are viewed and created for different purposes, such as a learning aid, being part of a software project’s documentation, or as a general knowledge-sharing resource. For organizations to remain competitive in attracting and retaining their workforce, they must adapt to these technological and social changes in software engineering practices. In this thesis, we propose a novel methodology for mining and integrating crowd-based multimedia content into existing workflows to help provide software engineers of different experience levels and roles with access to documentation they are familiar with or prefer. As a result, we first aim to gain insights into how a user’s background and the task to be performed influence the use of certain documentation media. We focus on tutorial screencasts to identify their important information sources and provide insights on their usage, advantages, and disadvantages from a practitioner’s perspective. To that end, we conduct a survey of software engineers. We discuss how software engineers benefit from screencasts as well as the challenges they face in using screencasts as project documentation. Our survey results revealed that screencasts and question-and-answer sites are among the most popular crowd-based information sources used by software engineers. Also, the level of experience and the role or reason for resorting to a documentation source affect the types of documentation used by software engineers. The results of our survey support our motivation in this thesis and show that, for screencasts, high-quality content and a narrator are very important components for users. Unfortunately, the binary format of videos makes analyzing video content difficult. As a result, dissecting and filtering multimedia information based on its relevance to a given project is an inherently difficult task. Therefore, it is necessary to provide automated approaches for mining and linking this crowd-based multimedia documentation to its relevant software artifacts. In this thesis, we apply LDA-based (Latent Dirichlet Allocation) mining approaches that take as input a set of screencast artifacts, such as GUI (Graphical User Interface) text (labels) and spoken words, to perform information extraction and, therefore, increase the availability of both textual and multimedia documentation for various stakeholders of a software product. For example, this allows screencasts to be linked to other software artifacts such as source code to help software developers/maintainers access the implementation details of an application feature. We also present applications of our proposed methodology that include: 1) an LDA-based mining approach that extracts use case scenarios in text format from screencasts, 2) an LDA-based approach that links screencasts to their relevant artifacts (e.g., source code), and 3) a Semantic Web-based approach to establish direct links between vulnerability exploitation screencasts and their relevant vulnerability descriptions in the National Vulnerability Database (NVD) and indirectly link screencasts to their relevant Maven dependencies. To evaluate the applicability of the proposed approach, we report on empirical case studies conducted on existing screencasts that describe different use case scenarios of the WordPress and Firefox open source applications or vulnerability exploitation scenarios.
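
    A toy sketch of the general LDA-based linking idea described above, using gensim: screencast text (spoken words and GUI labels) and source-file tokens are projected into one topic space and matched by similarity. The corpus, tokenization, and parameter values are all made up for illustration and do not reproduce the thesis's actual pipeline.

# Toy sketch: link screencast text to source files via a shared LDA topic space.
# All documents, tokens, and parameters are illustrative only.
from gensim import corpora, models, similarities

screencast_docs = [
    "click add new post title editor publish wordpress".split(),
    "open preferences privacy tab clear cookies firefox".split(),
]
source_docs = [
    "function wp_insert_post post title content status publish".split(),
    "clearCookies preference privacy sanitize cookie manager".split(),
]

dictionary = corpora.Dictionary(screencast_docs + source_docs)
corpus = [dictionary.doc2bow(d) for d in screencast_docs + source_docs]
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=2,
                      passes=20, random_state=0)

# Index the source documents in topic space, then query with each screencast.
index = similarities.MatrixSimilarity(
    lda[[dictionary.doc2bow(d) for d in source_docs]], num_features=2
)
for i, doc in enumerate(screencast_docs):
    sims = index[lda[dictionary.doc2bow(doc)]]
    best = max(range(len(sims)), key=lambda j: sims[j])
    print(f"screencast {i} -> source doc {best} (similarity {sims[best]:.2f})")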

    A family of experiments to validate measures for UML activity diagrams of ETL processes in data warehouses

    In data warehousing, Extract, Transform, and Load (ETL) processes are in charge of extracting the data from the data sources that will be contained in the data warehouse. Their design and maintenance is thus a cornerstone in any data warehouse development project. Due to their relevance, the quality of these processes should be formally assessed early in the development in order to avoid populating the data warehouse with incorrect data. To this end, this paper presents a set of measures with which to evaluate the structural complexity of ETL process models at the conceptual level. This study is, moreover, accompanied by the application of formal frameworks and a family of experiments whose aim is to theoretically and empirically validate the proposed measures, respectively. Our experiments show that the use of these measures can aid designers in predicting the effort associated with the maintenance tasks of ETL processes and in making ETL process models more usable. Our work is based on Unified Modeling Language (UML) activity diagrams for modeling ETL processes, and on the Framework for the Modeling and Evaluation of Software Processes (FMESP) for the definition and validation of the measures.
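
    To make the idea of structural-complexity measures concrete, here is a small sketch, not the paper's actual measure definitions, that computes simple counts over a UML-style activity diagram represented as plain Python data, in the spirit of the size and complexity counts used to reason about maintenance effort.

# Illustrative sketch only: simple structural counts over an ETL activity
# diagram modeled as nodes and edges; not the measures defined in the paper.
from dataclasses import dataclass, field

@dataclass
class ActivityDiagram:
    activities: set[str]
    decision_nodes: set[str]
    edges: set[tuple[str, str]] = field(default_factory=set)

    def measures(self) -> dict[str, float]:
        na = len(self.activities)      # number of activities
        nd = len(self.decision_nodes)  # number of decision nodes
        ne = len(self.edges)           # number of control-flow edges
        return {
            "NA": na,
            "ND": nd,
            "NE": ne,
            # A density-style ratio, showing how base counts can be combined
            # into a derived measure.
            "edges_per_activity": ne / na if na else 0.0,
        }

etl = ActivityDiagram(
    activities={"ExtractOrders", "CleanDates", "LoadFacts"},
    decision_nodes={"ValidRow?"},
    edges={("ExtractOrders", "CleanDates"), ("CleanDates", "ValidRow?"),
           ("ValidRow?", "LoadFacts")},
)
print(etl.measures())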

    Vista funcional del proceso de medición y evaluación (Functional view of the measurement and evaluation process)

    Process modeling allows views to be specified in order to better understand aspects of the processes/activities of a production line and to guarantee, for example, repeatability, consistency, and automation. Among the different process views cited in the literature are basically the following: functional, informational, organizational, and behavioral. In addition, the areas of measurement, evaluation, and analysis have gained importance, both in academia and in industry, as support for core Software/Web Engineering processes. This article mainly specifies the functional view of a process model for measurement, evaluation, and analysis. We also analyze the integration of this process view with an existing measurement and evaluation conceptual framework and its instantiation as technological support, and we likewise discuss our contribution with respect to related work. Sociedad Argentina de Informática e Investigación Operativa

    Estrategias de medición y evaluación: diseño de un estudio comparativo (Measurement and evaluation strategies: design of a comparative study)

    This work specifies an evaluation design for understanding and comparing integrated measurement and evaluation strategies, considering a strategy as a project resource from the point of view of the entity under assessment. Our objective is to evaluate the quality of the capabilities of a measurement and evaluation strategy, taking into account three foundations: 1) a conceptual framework centered on a terminological base, 2) explicit process support, and 3) methodological/technological support. In turn, to design the study we considered, for a strategy, aspects of non-functional requirements, measurement, evaluation, and analysis and recommendation. As a result of this research we have specified the study design; that is, the requirements tree in terms of characteristics and attributes, the design of the metrics that quantify these attributes, and their interpretation through the design of indicators. Consequently, in a follow-up article we will document the implementation of the comparative study, which will allow a rigorous analysis of the weaknesses and strengths of strategies such as GQM (Goal-Question-Metric) and GOCAME (Goal-Oriented Context-Aware Measurement and Evaluation), in order to define courses of action to improve them. Sociedad Argentina de Informática e Investigación Operativa
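
    The following sketch, with invented characteristics, weights, and values, shows the general shape of such a design: a requirements tree whose attribute-level indicator values are aggregated upward into characteristic- and global-level indicators by weighted average. It is illustrative only, not the study's actual tree or aggregation model.

# Illustrative sketch: a tiny requirements tree with weighted aggregation of
# indicator values. Names, weights, and values are invented for the example.
requirements_tree = {
    "Conceptual framework": {
        "weight": 0.4,
        "attributes": {"Terminological base coverage": 0.8,
                       "Ontological consistency": 0.9},
    },
    "Process support": {
        "weight": 0.3,
        "attributes": {"Activity definition completeness": 0.7},
    },
    "Method/tool support": {
        "weight": 0.3,
        "attributes": {"Tool automation level": 0.5},
    },
}

def characteristic_indicator(attrs: dict[str, float]) -> float:
    """Equal-weight average of the attribute-level indicator values (0..1)."""
    return sum(attrs.values()) / len(attrs)

global_indicator = sum(
    node["weight"] * characteristic_indicator(node["attributes"])
    for node in requirements_tree.values()
)
print(f"Global indicator: {global_indicator:.2f}")  # 0.70 for these toy values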

    Process Productivity Improvements through Semantic and Linked Data Technologies

    Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. Committee chair: José María Álvarez Rodríguez; Secretary: Rafael Valencia García; Member: Alejandro Rodríguez González

    Component-based software engineering: a quantitative approach

    Dissertation presented to obtain the degree of Doctor in Informatics from the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia. Background: Often, claims in Component-Based Development (CBD) are only supported by qualitative expert opinion, rather than by quantitative data. This contrasts with the normal practice in other sciences, where a sound experimental validation of claims is standard practice. Experimental Software Engineering (ESE) aims to bridge this gap. Unfortunately, it is common to find experimental validation efforts that are hard to replicate and compare, to build up the body of knowledge in CBD. Objectives: In this dissertation our goals are (i) to contribute to the evolution of ESE, in what concerns the replicability and comparability of experimental work, and (ii) to apply our proposals to CBD, thus contributing to its deeper and sounder understanding. Techniques: We propose a process model for ESE, aligned with current experimental best practices, and combine this model with a measurement technique called Ontology-Driven Measurement (ODM). ODM is aimed at improving the state of practice in metrics definition and collection, by making metrics definitions formal and executable, without sacrificing their usability. ODM uses standard technologies that can be well adapted to current integrated development environments. Results: Our contributions include the definition and preliminary validation of a process model for ESE and the proposal of ODM for supporting metrics definition and collection in the context of CBD. We use both the process model and ODM to perform a series of experimental works in CBD, including the cross-validation of a component metrics set for JavaBeans, a case study on the influence of practitioners' expertise in a sub-process of component development (component code inspections), and an observational study on reusability patterns of pluggable components (Eclipse plug-ins). These experimental works implied proposing, adapting, or selecting adequate ontologies, as well as the formal definition of metrics upon each of those ontologies. Limitations: Although our experimental work covers a variety of component models and, orthogonally, both process and product, the plethora of opportunities for using our quantitative approach to CBD is far from exhausted. Conclusions: The main contribution of this dissertation is the illustration, through practical examples, of how we can combine our experimental process model with ODM to support the experimental validation of claims in the context of CBD, in a repeatable and comparable way. In addition, the techniques proposed in this dissertation are generic and can be applied to other software development paradigms. Funding: Departamento de Informática of the Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa (FCT/UNL); Centro de Informática e Tecnologias da Informação of the FCT/UNL; Fundação para a Ciência e Tecnologia through the STACOS project (POSI/CHS/48875/2002); the Experimental Software Engineering Network (ESERNET); Association Internationale pour les Technologies Objets (AITO); Association for Computing Machinery (ACM)
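
    A rough sketch of the Ontology-Driven Measurement idea, formal and executable metric definitions over an ontology: a small component ontology is populated with rdflib and a metric is expressed once as a SPARQL query rather than informal prose, so its collection is repeatable. The vocabulary and the metric shown here are assumptions for illustration, not the dissertation's ontologies or metrics set.

# Hypothetical ODM-style sketch: a metric defined as an executable query over
# an ontology of components. Vocabulary and metric are illustrative only.
from rdflib import Graph, Namespace, RDF

CBD = Namespace("http://example.org/cbd#")
g = Graph()
g.bind("cbd", CBD)

g.add((CBD.ShoppingCartBean, RDF.type, CBD.Component))
for prop in ("items", "total", "discount"):
    g.add((CBD.ShoppingCartBean, CBD.providesProperty, CBD[prop]))

# Metric "NP": number of provided properties per component, defined formally
# as a query so that collection can be automated and replicated.
NP_QUERY = """
SELECT ?component (COUNT(?p) AS ?np) WHERE {
  ?component a cbd:Component ;
             cbd:providesProperty ?p .
} GROUP BY ?component
"""
for row in g.query(NP_QUERY, initNs={"cbd": CBD}):
    print(row.component, int(row.np))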

    A Data Quality Framework for the Secondary Use of Electronic Health Information

    University of Minnesota Ph.D. dissertation. April 2016. Major: Health Informatics. Advisor: Bonnie Westra. 1 computer file (PDF); ix, 101 pages. Electronic health record (EHR) systems are designed to replace paper charts and facilitate the delivery of care. Since EHR data is now readily available in electronic form, it is increasingly used for other purposes. This is expected to improve health outcomes for patients; however, the benefits will only be realized if the data captured in the EHR is of sufficient quality to support these secondary uses. This research demonstrated that a healthcare data quality framework can be developed that produces metrics characterizing underlying EHR data quality, and that it can be used to quantify the impact of data quality issues on the correctness of the intended use of the data. The framework described in this research defined a Data Quality (DQ) Ontology and implemented an assessment method. The DQ Ontology was developed by mining the healthcare data quality literature for important terms used to discuss data quality concepts; these terms were harmonized into an ontology. Four high-level data quality dimensions (CorrectnessMeasure, ConsistencyMeasure, CompletenessMeasure, and CurrencyMeasure) categorize 19 lower-level Measures. The ontology serves as an unambiguous vocabulary and allows more precision when discussing healthcare data quality. The DQ Ontology is expressed with sufficient rigor that it can be used for logical inference and computation. The data quality framework was used to characterize the data quality of an EHR for 10 data quality Measures. The results demonstrate that data quality can be quantified and that Metrics can track data quality trends over time and for specific domain concepts. The DQ framework produces scalar quantities which can be computed on individual domain concepts and meaningfully aggregated at different levels of an information model. The data quality assessment process was also used to quantify the impact of data quality issues on a task: the EHR data was systematically degraded and a measure of the impact on the correctness of the CMS178 eMeasure (Urinary Catheter Removal after Surgery) was computed. This information can help healthcare organizations prioritize data quality improvement efforts to focus on the areas that are most important and determine whether the data can support its intended use.
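
    A simplified sketch of what scalar, aggregatable data quality metrics can look like in code; the dimension chosen (completeness), the record fields, and the aggregation are invented for illustration and are not the dissertation's DQ Ontology or its 19 Measures.

# Illustrative sketch: a completeness-style metric computed per field and
# aggregated across fields. Field names and data are invented examples.
from typing import Any

ehr_records = [
    {"patient_id": "p1", "catheter_removed": "2016-04-02", "surgery_date": "2016-04-01"},
    {"patient_id": "p2", "catheter_removed": None,         "surgery_date": "2016-04-03"},
    {"patient_id": "p3", "catheter_removed": "2016-04-05", "surgery_date": None},
]

def completeness(records: list[dict[str, Any]], field: str) -> float:
    """Fraction of records with a non-missing value for `field` (0..1)."""
    return sum(r.get(field) is not None for r in records) / len(records)

per_field = {f: completeness(ehr_records, f)
             for f in ("patient_id", "catheter_removed", "surgery_date")}
overall = sum(per_field.values()) / len(per_field)  # aggregate over fields
print(per_field, f"overall completeness={overall:.2f}")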

    Dependency Management 2.0 – A Semantic Web Enabled Approach

    Software development and evolution are highly distributed processes that involve a multitude of supporting tools and resources. Application programming interfaces are commonly used by software developers to reduce development cost and complexity by reusing code developed by third parties or published by the open source community. However, these application programming interfaces have also introduced new challenges to the Software Engineering community (e.g., software vulnerabilities, API incompatibilities, and software license violations) that not only extend beyond the traditional boundaries of individual projects but also involve different software artifacts. As a result, there is a need for a technology-independent representation of software dependency semantics and the ability to seamlessly integrate this representation with knowledge from other software artifacts. The Semantic Web and its supporting technology stack have been widely promoted to model, integrate, and support interoperability among heterogeneous data sources. This dissertation takes advantage of the Semantic Web and its enabling technology stack for knowledge modeling and integration. The thesis introduces five major contributions: (1) We present a formal Software Build System Ontology (SBSON), which captures concepts and properties for software build and dependency management systems. This formal knowledge representation allows us to take advantage of Semantic Web inference services, forming the basis for a more flexible API dependency analysis compared to traditional proprietary analysis approaches. (2) We conducted a user survey involving 53 open source developers to gain insights into how actual developers manage API breaking changes. (3) We introduced a novel approach which integrates our SBSON model with knowledge about source code usage and changes within the Maven ecosystem to support API consumers and producers in managing (assessing and minimizing) the impacts of breaking changes. (4) A Security Vulnerability Analysis Framework (SV-AF) is introduced, which integrates build system, source code, versioning system, and vulnerability ontologies to trace and assess the impact of security vulnerabilities across project boundaries. (5) Finally, we introduce an Ontological Trustworthiness Assessment Model (OntTAM). OntTAM is an integration of our build, source code, vulnerability, and license ontologies which supports a holistic analysis and assessment of quality attributes related to the trustworthiness of libraries and APIs in open source systems. Several case studies are presented to illustrate the applicability and flexibility of our modelling approach, demonstrating that our knowledge modeling approach can seamlessly integrate and reuse knowledge extracted from existing build and dependency management systems with other existing heterogeneous data sources found in the software engineering domain. As part of our case studies, we also demonstrate how this unified knowledge model can enable new types of project dependency analysis.
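
    The sketch below gives a flavor of the cross-artifact queries such an integrated knowledge model enables: build dependencies and a vulnerability are placed in one RDF graph, and a SPARQL property path finds projects that depend, directly or transitively, on the vulnerable artifact. The namespace, properties, and example artifacts are assumptions for illustration, not the actual SBSON or SV-AF vocabularies.

# Hypothetical sketch: tracing a vulnerability across transitive build
# dependencies with a SPARQL property path. Vocabulary is invented.
from rdflib import Graph, Namespace

DEP = Namespace("http://example.org/deps#")
g = Graph()
g.bind("dep", DEP)

# my-app -> commons-http -> old-ssl-lib (affected by a CVE)
g.add((DEP["my-app"], DEP.dependsOn, DEP["commons-http"]))
g.add((DEP["commons-http"], DEP.dependsOn, DEP["old-ssl-lib"]))
g.add((DEP["old-ssl-lib"], DEP.affectedBy, DEP["CVE-2014-3566"]))

QUERY = """
SELECT ?project ?cve WHERE {
  ?project dep:dependsOn+ ?lib .
  ?lib dep:affectedBy ?cve .
}
"""
for row in g.query(QUERY, initNs={"dep": DEP}):
    print(f"{row.project} is (transitively) exposed to {row.cve}")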