752 research outputs found

    Dependency Management 2.0 – A Semantic Web Enabled Approach

    Get PDF
    Software development and evolution are highly distributed processes that involve a multitude of supporting tools and resources. Application programming interfaces are commonly used by software developers to reduce development cost and complexity by reusing code developed by third-parties or published by the open source community. However, these application programming interfaces have also introduced new challenges to the Software Engineering community (e.g., software vulnerabilities, API incompatibilities, and software license violations) that not only extend beyond the traditional boundaries of individual projects but also involve different software artifacts. As a result, there is the need for a technology-independent representation of software dependency semantics and the ability to seamlessly integrate this representation with knowledge from other software artifacts. The Semantic Web and its supporting technology stack have been widely promoted to model, integrate, and support interoperability among heterogeneous data sources. This dissertation takes advantage of the Semantic Web and its enabling technology stack for knowledge modeling and integration. The thesis introduces five major contributions: (1) We present a formal Software Build System Ontology – SBSON, which captures concepts and properties for software build and dependency management systems. This formal knowledge representation allows us to take advantage of Semantic Web inference services forming the basis for a more flexibility API dependency analysis compared to traditional proprietary analysis approaches. (2) We conducted a user survey which involved 53 open source developers to allow us to gain insights on how actual developers manage API breaking changes. (3) We introduced a novel approach which integrates our SBSON model with knowledge about source code usage and changes within the Maven ecosystem to support API consumers and producers in managing (assessing and minimizing) the impacts of breaking changes. (4) A Security Vulnerability Analysis Framework (SV-AF) is introduced, which integrates builds system, source code, versioning system, and vulnerability ontologies to trace and assess the impact of security vulnerabilities across project boundaries. (5) Finally, we introduce an Ontological Trustworthiness Assessment Model (OntTAM). OntTAM is an integration of our build, source code, vulnerability and license ontologies which supports a holistic analysis and assessment of quality attributes related to the trustworthiness of libraries and APIs in open source systems. Several case studies are presented to illustrate the applicability and flexibility of our modelling approach, demonstrating that our knowledge modeling approach can seamlessly integrate and reuse knowledge extracted from existing build and dependency management systems with other existing heterogeneous data sources found in the software engineering domain. As part of our case studies, we also demonstrate how this unified knowledge model can enable new types of project dependency analysis

    Enhancing Trust –A Unified Meta-Model for Software Security Vulnerability Analysis

    Get PDF
    Over the last decade, a globalization of the software industry has taken place which has facilitated the sharing and reuse of code across existing project boundaries. At the same time, such global reuse also introduces new challenges to the Software Engineering community, with not only code implementation being shared across systems but also any vulnerabilities it is exposed to as well. Hence, vulnerabilities found in APIs no longer affect only individual projects but instead might spread across projects and even global software ecosystem borders. Tracing such vulnerabilities on a global scale becomes an inherently difficult task, with many of the resources required for the analysis not only growing at unprecedented rates but also being spread across heterogeneous resources. Software developers are struggling to identify and locate the required data to take full advantage of these resources. The Semantic Web and its supporting technology stack have been widely promoted to model, integrate, and support interoperability among heterogeneous data sources. This dissertation introduces four major contributions to address these challenges: (1) It provides a literature review of the use of software vulnerabilities databases (SVDBs) in the Software Engineering community. (2) Based on findings from this literature review, we present SEVONT, a Semantic Web based modeling approach to support a formal and semi-automated approach for unifying vulnerability information resources. SEVONT introduces a multi-layer knowledge model which not only provides a unified knowledge representation, but also captures software vulnerability information at different abstract levels to allow for seamless integration, analysis, and reuse of the modeled knowledge. The modeling approach takes advantage of Formal Concept Analysis (FCA) to guide knowledge engineers in identifying reusable knowledge concepts and modeling them. (3) A Security Vulnerability Analysis Framework (SV-AF) is introduced, which is an instantiation of the SEVONT knowledge model to support evidence-based vulnerability detection. The framework integrates vulnerability ontologies (and data) with existing Software Engineering ontologies allowing for the use of Semantic Web reasoning services to trace and assess the impact of security vulnerabilities across project boundaries. Several case studies are presented to illustrate the applicability and flexibility of our modelling approach, demonstrating that the presented knowledge modeling approach cannot only unify heterogeneous vulnerability data sources but also enables new types of vulnerability analysis

    Enhancement of Web Security Against External Attack

    Get PDF
    The security of web-based services is currently playing a vital role for the software industry. In recent years, many technologies and standards have emerged in order to handle the security issues related to web services. This paper shows techniques to enhance the security of web services, and some of the recent challenges and recommendations of a proposed model to secure web services. It shows the security process of a real life web application, which includes; HTML5 forms, login security, and a single signon solution. This paper also aim to discuss the ten (10) most common web security vulnerabilities and how to prevent the web application from three (3) of the vulnerabilities. Amongst them are; SQL Injection, Cross Site Scripting and Broken Authentication, and Session Management

    Vulnerable Open Source Dependencies: Counting Those That Matter

    Full text link
    BACKGROUND: Vulnerable dependencies are a known problem in today's open-source software ecosystems because OSS libraries are highly interconnected and developers do not always update their dependencies. AIMS: In this paper we aim to present a precise methodology, that combines the code-based analysis of patches with information on build, test, update dates, and group extracted from the very code repository, and therefore, caters to the needs of industrial practice for correct allocation of development and audit resources. METHOD: To understand the industrial impact of the proposed methodology, we considered the 200 most popular OSS Java libraries used by SAP in its own software. Our analysis included 10905 distinct GAVs (group, artifact, version) when considering all the library versions. RESULTS: We found that about 20% of the dependencies affected by a known vulnerability are not deployed, and therefore, they do not represent a danger to the analyzed library because they cannot be exploited in practice. Developers of the analyzed libraries are able to fix (and actually responsible for) 82% of the deployed vulnerable dependencies. The vast majority (81%) of vulnerable dependencies may be fixed by simply updating to a new version, while 1% of the vulnerable dependencies in our sample are halted, and therefore, potentially require a costly mitigation strategy. CONCLUSIONS: Our case study shows that the correct counting allows software development companies to receive actionable information about their library dependencies, and therefore, correctly allocate costly development and audit resources, which is spent inefficiently in case of distorted measurements.Comment: This is a pre-print of the paper that appears, with the same title, in the proceedings of the 12th International Symposium on Empirical Software Engineering and Measurement, 201

    Mining and linking crowd-based software engineering how-to screencasts

    Get PDF
    In recent years, crowd-based content in the form of screencast videos has gained in popularity among software engineers. Screencasts are viewed and created for different purposes, such as a learning aid, being part of a software project’s documentation, or as a general knowledge sharing resource. For organizations to remain competitive in attracting and retaining their workforce, they must adapt to these technological and social changes in software engineering practices. In this thesis, we propose a novel methodology for mining and integrating crowd-based multi- media content in existing workflows to help provide software engineers of different levels of experience and roles access to a documentation they are familiar with or prefer. As a result, we first aim to gain insights on how a user’s background and the task to be performed influence the use of certain documentation media. We focus on tutorial screencasts to identify their important information sources and provide insights on their usage, advantages, and disadvantages from a practitioner’s perspective. To that end, we conduct a survey of software engineers. We discuss how software engineers benefit from screencasts as well as challenges they face in using screencasts as project documentation. Our survey results revealed that screencasts and question and answers sites are among the most popular crowd-based information sources used by software engineers. Also, the level of experience and the role or reason for resorting to a documentation source affects the types of documentation used by software engineers. The results of our survey support our motivation in this thesis and show that for screencasts, high quality content and a narrator are very important components for users. Unfortunately, the binary format of videos makes analyzing video content difficult. As a result, dissecting and filtering multimedia information based on its relevance to a given project is an inherently difficult task. Therefore, it is necessary to provide automated approaches for mining and linking this crowd-based multimedia documentation to their relevant software artifacts. In this thesis, we apply LDA-based (Latent Dirichlet Allocation) mining approaches that take as input a set of screencast artifacts, such as GUI (Graphical User Interface) text (labels) and spoken words, to perform information extraction and, therefore, increase the availability of both textual and multimedia documentation for various stakeholders of a software product. For example, this allows screencasts to be linked to other software artifacts such as source code to help software developers/maintainers have access to the implementation details of an application feature. We also present applications of our proposed methodology that include: 1) an LDA-based mining approach that extracts use case scenarios in text format from screencasts, 2) an LDA-based approach that links screencasts to their relevant artifacts (e.g., source code), and 3) a Semantic Web-based approach to establish direct links between vulnerability exploitation screencasts and their relevant vulnerability descriptions in the National Vulnerability Database (NVD) and indirectly link screencasts to their relevant Maven dependencies. To evaluate the applicability of the proposed approach, we report on empirical case studies conducted on existing screencasts that describe different use case scenarios of the WordPress and Firefox open source applications or vulnerability exploitation scenarios

    A Knowledge Graph to Represent Software Vulnerabilities

    Get PDF
    Over the past decade, there has been a major shift towards the globalization of the software industry, by allowing code to be shared and reused across project boundaries. This global code reuse can take on various forms, include components or libraries which are publicly available on the Internet. However, this code reuse also comes with new challenges, since not only code but also vulnerabilities these components might be exposed to are shared. The software engineering community has attempted to address this challenge by introducing bug bounty platforms and software vulnerability repositories, to help organizations manage known vulnerabilities in their systems. However, with the ever-increasing number of vulnerabilities and information related to these vulnerabilities, it has become inherently more difficult to synthesize this knowledge. Knowledge Graphs and their supporting technology stack have been promoted as one possible solution to model, integrate, and support interoperability among heterogeneous data sources. In this thesis, we introduce a methodology that takes advantage of knowledge graphs to integrate resources related to known software vulnerabilities. More specifically, this thesis takes advantage of knowledge graphs to introduce a unified representation that transforms traditional information silos (e.g., VDBs, bug bounty programs) and transforms them in information hubs. Several use cases are presented to illustrate the applicability and flexibility of our modeling approach, demonstrating that the presented knowledge modeling approach can indeed unify heterogeneous vulnerability data sources and enable new types of vulnerability analysis

    A cloud, microservice-based digital twin for the Oil industry a flow assurance case study

    Get PDF
    Digital Twins are one of the top recent technology trends, considered a front-runner in enabling the next generation of intelligent, digitalized industries. A Digital Twin is basi cally a real enterprise asset mirrored into the virtual world, created entirely in software, and utilized to innovate business, generate new revenue, and create value-producing op portunities. Naturally, building such demanding systems is not an easy task. Many organi zational and technological concerns should be addressed, like real-time processing, data management, cybersecurity, and low latency communication. Moreover, Digital Twins should be extensible and support data integration with its physical counterpart, allowing enterprises to streamline their operations more safely. Considering there are no standard methodologies or technologies to design a Digital Twin, software developers struggle to develop a robust, reliable Digital Twin architecture that meets customers’ needs. Therefore, in consideration of these technical challenges, we decided to explore solutions in the oil and gas industry. Oil plants are known for having notoriously complex pro cesses and an oppressive ecosystem when it comes to the adoption of new technologies, especially due to its intricate and ever-changing landscape. Despite these difficulties, this industry would certainly benefit from a well-designed Digital Twin. Hence, this work proposes a solution for an Oil Well Digital Twin architecture. We decided to use the flow assurance process of an Oil Well as a base to build a Digital Twin, aiming to study and fulfill those requirements and technical limitations. We implemented and tested several APIs, each one representing the virtual version of a real Oil Well component, enforcing the importance of microservices and cloud-native patterns in its design. When function ing together, these microservices comprise an entire Oil Well Digital twin solution, cov ering basic storage, communication, and monitoring fundamentals. This proof of concept demonstrates how a complex asset — like an Oil Well — can be built and organized in a completely digital manner. Lastly, this design is extensible and opens up further develop ment opportunities.Gêmeos Digitais são uma das principais tendências tecnológicas recentes, considerados pioneiros na habilitação da próxima geração de indústrias inteligentes e digitalizadas. Um Gêmeo Digital é basicamente um ativo corporativo espelhado no mundo virtual, criado inteiramente em software e destinado a inovar os modelos operacionais e de negócios, ao mesmo tempo em que fornece novas oportunidades de geração de receita e valor. Naturalmente, construir sistemas tão exigentes não é uma tarefa fácil. Muitas preocupações organizacionais e tecnológicas devem ser abordadas, como processamento em tempo real, gerenciamento de dados, segurança cibernética e comunicação de baixa latência. Além disso, os Gêmeos Digitais também devem ser extensíveis e oferecer suporte à integração de dados com sua contraparte física, permitindo que as empresas agilizem suas opera ções com mais segurança. Considerando que não existem metodologias ou tecnologias padrão para projetar um Gêmeo Digital, os desenvolvedores de software tem dificuldade em desenvolver uma arquitetura de Gêmeo Digital robusta e confiável que atenda às ne cessidades dos clientes. Além desses desafios técnicos, a indústria de óleo e gás é conhecida por ter processos notoriamente complexos e um ecossistema opressor na adoção de novas tecnologias, prin cipalmente devido ao seu cenário intrincado e em constante mudança. Apesar dessas dificuldades, essa indústria certamente se beneficiaria de um Gêmeo Digital bem proje tado. Assim, este trabalho propõe uma solução de uma arquitetura para um Gêmeo Digital de um Poço de Petróleo. Decidimos usar o processo de garantia de vazão de um poço de petróleo como base para construir um Gêmeo Digital, visando estudar e atender a esses requisitos e limitações técnicas. Implementamos e testamos várias APIs, cada uma repre sentando a versão virtual de um componente real de um poço de petróleo, reforçando a importância de microsserviços e padrões nativos de nuvem em seu design. Ao funciona rem juntos, esses microsserviços compreendem uma solução completa de gêmeos digitais de poços de petróleo, abrangendo os fundamentos básicos de armazenamento, comunica ção e monitoramento. Esta prova de conceito demonstra como um ativo complexo – como um poço de petróleo – pode ser construído e organizado de forma totalmente digital. Por fim, esse design é extensível e abre mais oportunidades de desenvolvimento
    • …
    corecore