13 research outputs found

    Efficient Prior Publication Identification for Open Source Code

    Full text link
    Free/Open Source Software (FOSS) enables large-scale reuse of preexisting software components. The main drawback is increased complexity in software supply chain management. A common approach to tame such complexity is automated open source compliance, which consists in automating the verication of adherence to various open source management best practices about license obligation fulllment, vulnerability tracking, software composition analysis, and nearby concerns.We consider the problem of auditing a source code base to determine which of its parts have been published before, which is an important building block of automated open source compliance toolchains. Indeed, if source code allegedly developed in house is recognized as having been previously published elsewhere, alerts should be raised to investigate where it comes from and whether this entails that additional obligations shall be fullled before product shipment.We propose an ecient approach for prior publication identication that relies on a knowledge base of known source code artifacts linked together in a global Merkle direct acyclic graph and a dedicated discovery protocol. We introduce swh-scanner, a source code scanner that realizes the proposed approach in practice using as knowledge base Software Heritage, the largest public archive of source code artifacts. We validate experimentally the proposed approach, showing its eciency in both abstract (number of queries) and concrete terms (wall-clock time), performing benchmarks on 16 845 real-world public code bases of various sizes, from small to very large

    Assisting Software Developers With License Compliance

    Get PDF
    Open source licensing determines how open source systems are reused, distributed, and modified from a legal perspective. While it facilitates rapid development, it can present difficulty for developers in understanding due to the legal language of these licenses. Because of misunderstandings, systems can incorporate licensed code in a way that violates the terms of the license. Such incompatibilities between licensing can result in the inability to reuse a particular library without either relicensing the system or redesigning the architecture of the system. Prior efforts have predominantly focused on license identification or understanding the underlying phenomena without reasoning about compatibility in a broad scale. The work in this dissertation first investigates the rationale of developers and identifies the areas that developers struggle with respect to free/open source software licensing. First, we investigate the diffusion of licenses and the prevalence of license changes in a large scale empirical study of 16,221 Java systems. We observed a clear lack of traceability and a lack of standardized licensing that led to difficulties and confusion for developers trying to reuse source code. We further investigated the difficulty by surveying the developers of the systems with license changes to understand why they first adopted a license and then changed licenses. Additionally, we performed an analysis on issue trackers and legal mailing lists to extract licensing bugs. From these works, we identified key areas in which developers struggled and needed support. While developers need support to identify license incompatibilities and understand both the cause and implications of the incompatibilities, we observed that state-of-the-art license identification tools did not identify license exceptions. Since these exceptions directly modify the license terms (either the permissions granted by the license or the restrictions imposed by the license), we proposed an approach to complement current license identification techniques in order to classify license exceptions. The approach relies on supervised machine learners to classify the licensing text to identify the particular license exceptions or the lack of a license exception. Subsequently, we built an infrastructure to assist developers with evaluating license compliance warnings for their system. The infrastructure evaluates compliance across the dependency tree of a system to ensure it is compliant with all of the licenses of the dependencies. When an incompatibility is present, it notes the specific library/libraries and the conflicting license(s) so that the developers can investigate these compliance warnings, which would prevent distribution of their software, in their system. We conduct a study on 121,094 open source projects spanning 6 programming languages, and we demonstrate that the infrastructure is able to identify license incompatibilities between these projects and their dependencies

    Open Source Law, Policy and Practice

    Get PDF
    This book examines various policies, including the legal and commercial aspects of the Open Source phenomenon. Here, ‘Open Source’ is adopted as convenient shorthand for a collection of diverse users and communities, whose differences can be as great as their similarities. The common thread is their reliance on, and use of, law and legal mechanisms to govern the source code they write, use, and distribute. The central fact of open source is that maintaining control over source code relies on the existence and efficacy of intellectual property (‘IP’) laws, particularly copyright law. Copyright law is the primary statutory tool that achieves the end of openness, although implemented through private law arrangements at varying points within the software supply chain. This dependent relationship is itself a cause of concern for some philosophically in favour of ‘open’, with some predicting (or hoping) that the free software movement will bring about the end of copyright as a means for protecting software

    Open Source Law, Policy and Practice

    Get PDF
    This book examines various policies, including the legal and commercial aspects of the Open Source phenomenon. Here, ‘Open Source’ is adopted as convenient shorthand for a collection of diverse users and communities, whose differences can be as great as their similarities. The common thread is their reliance on, and use of, law and legal mechanisms to govern the source code they write, use, and distribute. The central fact of open source is that maintaining control over source code relies on the existence and efficacy of intellectual property (‘IP’) laws, particularly copyright law. Copyright law is the primary statutory tool that achieves the end of openness, although implemented through private law arrangements at varying points within the software supply chain. This dependent relationship is itself a cause of concern for some philosophically in favour of ‘open’, with some predicting (or hoping) that the free software movement will bring about the end of copyright as a means for protecting software

    Oswaldo: A Semantic Web Enabled Approach for Identifying Open Source License Violations

    Get PDF
    Open source license violations are numerous, multifaceted, and pose significant risk to developers and companies in the form of litigation, sometimes resulting in millions in dollars in damages or settlements. Free/Libre and Open Source Licenses utilize copyright law and are written in legalese, which is often outside the scope of a developer’s expertise. Software Engineers commit violations of these licenses’ terms and conditions easily and often unknowingly. Consequently, increased knowledge, better tools, and sound processes to detect and prevent license violations are extremely important. This work is an investigation in the types of potential license violations that are committed, through direct and transitive dependency hierarchies in hundreds of thousands of real-world software projects. This thesis contributes a novel approach, entitled Oswaldo, that defines and detects three types of license conflicts: Type 1 Simple Violation, Type 2 Embedded Violations, Type 3 Compound Violations. Unidirectional compatibility/incompatibility relationships of major licenses are modelled. Ontologies and Linked Data are advantageously exploited to detect transitive violation Types 2 and 3, as well as the direct violation Type 1. This thesis also reports initial evaluations of these three types of license violations found in the Maven repository

    Modeling a Consortium-based Distributed Ledger Network with Applications for Intelligent Transportation Infrastructure

    Get PDF
    Emerging distributed-ledger networks are changing the landscape for environments of low trust among participating entities. Implementing such technologies in transportation infrastructure communications and operations would enable, in a secure fashion, decentralized collaboration among entities who do not fully trust each other. This work models a transportation records and events data collection system enabled by a Hyperledger Fabric blockchain network and simulated using a transportation environment modeling tool. A distributed vehicle records management use case is shown with the capability to detect and prevent unauthorized vehicle odometer tampering. Another use case studied is that of vehicular data collected during the event of an accident. It relies on broadcast data collected from the Vehicle Ad-hoc Network (VANET) and submitted as witness reports from nearby vehicles or road-side units who observed the event taking place or detected misbehaving activity by vehicles involved in the accident. Mechanisms for the collection, validation, and corroboration of the reported data which may prove crucial for vehicle accident forensics are described and their implementation is discussed. A performance analysis of the network under various loads is conducted with results suggesting that tailored endorsement policies are an effective mechanism to improve overall network throughput for a given channel. The experimental testbed shows that Hyperledger Fabric and other distributed ledger technologies hold promise for the collection of transportation data and the collaboration of applications and services that consume it

    Dependency Management 2.0 – A Semantic Web Enabled Approach

    Get PDF
    Software development and evolution are highly distributed processes that involve a multitude of supporting tools and resources. Application programming interfaces are commonly used by software developers to reduce development cost and complexity by reusing code developed by third-parties or published by the open source community. However, these application programming interfaces have also introduced new challenges to the Software Engineering community (e.g., software vulnerabilities, API incompatibilities, and software license violations) that not only extend beyond the traditional boundaries of individual projects but also involve different software artifacts. As a result, there is the need for a technology-independent representation of software dependency semantics and the ability to seamlessly integrate this representation with knowledge from other software artifacts. The Semantic Web and its supporting technology stack have been widely promoted to model, integrate, and support interoperability among heterogeneous data sources. This dissertation takes advantage of the Semantic Web and its enabling technology stack for knowledge modeling and integration. The thesis introduces five major contributions: (1) We present a formal Software Build System Ontology – SBSON, which captures concepts and properties for software build and dependency management systems. This formal knowledge representation allows us to take advantage of Semantic Web inference services forming the basis for a more flexibility API dependency analysis compared to traditional proprietary analysis approaches. (2) We conducted a user survey which involved 53 open source developers to allow us to gain insights on how actual developers manage API breaking changes. (3) We introduced a novel approach which integrates our SBSON model with knowledge about source code usage and changes within the Maven ecosystem to support API consumers and producers in managing (assessing and minimizing) the impacts of breaking changes. (4) A Security Vulnerability Analysis Framework (SV-AF) is introduced, which integrates builds system, source code, versioning system, and vulnerability ontologies to trace and assess the impact of security vulnerabilities across project boundaries. (5) Finally, we introduce an Ontological Trustworthiness Assessment Model (OntTAM). OntTAM is an integration of our build, source code, vulnerability and license ontologies which supports a holistic analysis and assessment of quality attributes related to the trustworthiness of libraries and APIs in open source systems. Several case studies are presented to illustrate the applicability and flexibility of our modelling approach, demonstrating that our knowledge modeling approach can seamlessly integrate and reuse knowledge extracted from existing build and dependency management systems with other existing heterogeneous data sources found in the software engineering domain. As part of our case studies, we also demonstrate how this unified knowledge model can enable new types of project dependency analysis

    Tools and Algorithms for the Construction and Analysis of Systems

    Get PDF
    This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The total of 41 full papers presented in the proceedings was carefully reviewed and selected from 141 submissions. The volume also contains 7 tool papers; 6 Tool Demo papers, 9 SV-Comp Competition Papers. The papers are organized in topical sections as follows: Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-Comp Tool Competition Papers

    A Framework for Facilitating Secure Design and Development of IoT Systems

    Get PDF
    The term Internet of Things (IoT) describes an ever-growing ecosystem of physical objects or things interconnected with each other and connected to the Internet. IoT devices consist of a wide range of highly heterogeneous inanimate and animate objects. Thus, a thing in the context of the IoT can even mean a person with blood pressure or heart rate monitor implant or a pet with a biochip transponder. IoT devices range from ordinary household appliances, such as smart light bulbs or smart coffee makers, to sophisticated tools for industrial automation. IoT is currently leading a revolutionary change in many industries and, as a result, a lot of industries and organizations are adopting the paradigm to gain a competitive edge. This allows them to boost operational efficiency and optimize system performance through real-time data management, which results in an optimized balance between energy usage and throughput. Another important application area is the Industrial Internet of Things (IIoT), which is the application of the IoT in industrial settings. This is also referred to as the Industrial Internet or Industry 4.0, where Cyber- Physical Systems (CPS) are interconnected using various technologies to achieve wireless control as well as advanced manufacturing and factory automation. IoT applications are becoming increasingly prevalent across many application domains, including smart healthcare, smart cities, smart grids, smart farming, and smart supply chain management. Similarly, IoT is currently transforming the way people live and work, and hence the demand for smart consumer products among people is also increasing steadily. Thus, many big industry giants, as well as startup companies, are competing to dominate the market with their new IoT products and services, and hence unlocking the business value of IoT. Despite its increasing popularity, potential benefits, and proven capabilities, IoT is still in its infancy and fraught with challenges. The technology is faced with many challenges, including connectivity issues, compatibility/interoperability between devices and systems, lack of standardization, management of the huge amounts of data, and lack of tools for forensic investigations. However, the state of insecurity and privacy concerns in the IoT are arguably among the key factors restraining the universal adoption of the technology. Consequently, many recent research studies reveal that there are security and privacy issues associated with the design and implementation of several IoT devices and Smart Applications (smart apps). This can be attributed, partly, to the fact that as some IoT device makers and smart apps development companies (especially the start-ups) reap business value from the huge IoT market, they tend to neglect the importance of security. As a result, many IoT devices and smart apps are created with security vulnerabilities, which have resulted in many IoT related security breaches in recent years. This thesis is focused on addressing the security and privacy challenges that were briefly highlighted in the previous paragraph. Given that the Internet is not a secure environ ment even for the traditional computer systems makes IoT systems even less secure due to the inherent constraints associated with many IoT devices. These constraints, which are mainly imposed by cost since many IoT edge devices are expected to be inexpensive and disposable, include limited energy resources, limited computational and storage capabilities, as well as lossy networks due to the much lower hardware performance compared to conventional computers. While there are many security and privacy issues in the IoT today, arguably a root cause of such issues is that many start-up IoT device manufacturers and smart apps development companies do not adhere to the concept of security by design. Consequently, some of these companies produce IoT devices and smart apps with security vulnerabilities. In recent years, attackers have exploited different security vulnerabilities in IoT infrastructures which have caused several data breaches and other security and privacy incidents involving IoT devices and smart apps. These have attracted significant attention from the research community in both academia and industry, resulting in a surge of proposals put forward by many researchers. Although research approaches and findings may vary across different research studies, the consensus is that a fundamental prerequisite for addressing IoT security and privacy challenges is to build security and privacy protection into IoT devices and smart apps from the very beginning. To this end, this thesis investigates how to bake security and privacy into IoT systems from the onset, and as its main objective, this thesis particularly focuses on providing a solution that can foster the design and development of secure IoT devices and smart apps, namely the IoT Hardware Platform Security Advisor (IoT-HarPSecA) framework. The security framework is expected to provide support to designers and developers in IoT start-up companies during the design and implementation of IoT systems. IoT-HarPSecA framework is also expected to facilitate the implementation of security in existing IoT systems. To accomplish the previously mentioned objective as well as to affirm the aforementioned assertion, the following step-by-step problem-solving approach is followed. The first step is an exhaustive survey of different aspects of IoT security and privacy, including security requirements in IoT architecture, security threats in IoT architecture, IoT application domains and their associated cyber assets, the complexity of IoT vulnerabilities, and some possible IoT security and privacy countermeasures; and the survey wraps up with a brief overview of IoT hardware development platforms. The next steps are the identification of many challenges and issues associated with the IoT, which narrowed down to the abovementioned fundamental security/privacy issue; followed by a study of different aspects of security implementation in the IoT. The remaining steps are the framework design thinking process, framework design and implementation, and finally, framework performance evaluation. IoT-HarPSecA offers three functionality features, namely security requirement elicitation security best practice guidelines for secure development, and above all, a feature that recommends specific Lightweight Cryptographic Algorithms (LWCAs) for both software and hardware implementations. Accordingly, IoT-HarPSecA is composed of three main components, namely Security Requirements Elicitation (SRE) component, Security Best Practice Guidelines (SBPG) component, and Lightweight Cryptographic Algorithms Recommendation (LWCAR) component, each of them servicing one of the aforementioned features. The author has implemented a command-line tool in C++ to serve as an interface between users and the security framework. This thesis presents a detailed description, design, and implementation of the SRE, SBPG, and LWCAR components of the security framework. It also presents real-world practical scenarios that show how IoT-HarPSecA can be used to elicit security requirements, generate security best practices, and recommend appropriate LWCAs based on user inputs. Furthermore, the thesis presents performance evaluation of the SRE, SBPG, and LWCAR components framework tools, which shows that IoT-HarPSecA can serve as a roadmap for secure IoT development.O termo Internet das coisas (IoT) é utilizado para descrever um ecossistema, em expansão, de objetos físicos ou elementos interconetados entre si e à Internet. Os dispositivos IoT consistem numa gama vasta e heterogénea de objetos animados ou inanimados e, neste contexto, podem pertencer à IoT um indivíduo com um implante que monitoriza a frequência cardíaca ou até mesmo um animal de estimação que tenha um biochip. Estes dispositivos variam entre eletrodomésticos, tais como máquinas de café ou lâmpadas inteligentes, a ferramentas sofisticadas de uso na automatização industrial. A IoT está a revolucionar e a provocar mudanças em várias indústrias e muitas adotam esta tecnologia para incrementar as suas vantagens competitivas. Este paradigma melhora a eficiência operacional e otimiza o desempenho de sistemas através da gestão de dados em tempo real, resultando num balanço otimizado entre o uso energético e a taxa de transferência. Outra área de aplicação é a IoT Industrial (IIoT) ou internet industrial ou Indústria 4.0, ou seja, uma aplicação de IoT no âmbito industrial, onde os sistemas ciberfísicos estão interconectados a diversas tecnologias de forma a obter um controlo de rede sem fios, bem como fabricações avançadas e automatização fabril. As aplicações da IoT estão a crescer e a tornarem-se predominantes em muitos domínios de aplicação inteligentes como sistemas de saúde, cidades, redes, agricultura e sistemas de fornecimento. Da mesma forma, a IoT está a transformar estilos de vida e de trabalho e assim, a procura por produtos inteligentes está constantemente a aumentar. As grandes indústrias e startups competem entre si de forma a dominar o mercado com os seus novos serviços e produtos IoT, desbloqueando o valor de negócio da IoT. Apesar da sua crescente popularidade, benefícios e capacidades comprovadas, a IoT está ainda a dar os seus primeiros passos e é confrontada com muitos desafios. Entre eles, problemas de conectividade, compatibilidade/interoperabilidade entre dispositivos e sistemas, falta de padronização, gestão das enormes quantidades de dados e ainda falta de ferramentas para investigações forenses. No entanto, preocupações quanto ao estado de segurança e privacidade ainda estão entre os fatores adversos à adesão universal desta tecnologia. Estudos recentes revelaram que existem questões de segurança e privacidade associadas ao design e implementação de vários dispositivos IoT e aplicações inteligentes (smart apps.), isto pode ser devido ao facto, em parte, de que alguns fabricantes e empresas de desenvolvimento de dispositivos (especialmente startups) IoT e smart apps., recolham o valor de negócio dos grandes mercados IoT, negligenciando assim a importância da segurança, resultando em dispositivos IoT e smart apps. com carências e violações de segurança da IoT nos últimos anos. Esta tese aborda os desafios de segurança e privacidade que foram supra mencionados. Visto que a Internet e os sistemas informáticos tradicionais são por vezes considerados inseguros, os sistemas IoT tornam-se ainda mais inseguros, devido a restrições inerentes a tais dispositivos. Estas restrições são impostas devido ao custo, uma vez que se espera que muitos dispositivos de ponta sejam de baixo custo e descartáveis, com recursos energéticos limitados, bem como limitações na capacidade de armazenamento e computacionais, e redes com perdas devido a um desempenho de hardware de qualidade inferior, quando comparados com computadores convencionais. Uma das raízes do problema é o facto de que muitos fabricantes, startups e empresas de desenvolvimento destes dispositivos e smart apps não adiram ao conceito de segurança por construção, ou seja, logo na conceção, não preveem a proteção da privacidade e segurança. Assim, alguns dos produtos e dispositivos produzidos apresentam vulnerabilidades na segurança. Nos últimos anos, hackers maliciosos têm explorado diferentes vulnerabilidades de segurança nas infraestruturas da IoT, causando violações de dados e outros incidentes de privacidade envolvendo dispositivos IoT e smart apps. Estes têm atraído uma atenção significativa por parte das comunidades académica e industrial, que culminaram num grande número de propostas apresentadas por investigadores científicos. Ainda que as abordagens de pesquisa e os resultados variem entre os diferentes estudos, há um consenso e pré-requisito fundamental para enfrentar os desafios de privacidade e segurança da IoT, que buscam construir proteção de segurança e privacidade em dispositivos IoT e smart apps. desde o fabrico. Para esta finalidade, esta tese investiga como produzir segurança e privacidade destes sistemas desde a produção, e como principal objetivo, concentra-se em fornecer soluções que possam promover a conceção e o desenvolvimento de dispositivos IoT e smart apps., nomeadamente um conjunto de ferramentas chamado Consultor de Segurança da Plataforma de Hardware da IoT (IoT-HarPSecA). Espera-se que o conjunto de ferramentas forneça apoio a designers e programadores em startups durante a conceção e implementação destes sistemas ou que facilite a integração de mecanismos de segurança nos sistemas préexistentes. De modo a alcançar o objetivo proposto, recorre-se à seguinte abordagem. A primeira fase consiste num levantamento exaustivo de diferentes aspetos da segurança e privacidade na IoT, incluindo requisitos de segurança na arquitetura da IoT e ameaças à sua segurança, os seus domínios de aplicação e os ativos cibernéticos associados, a complexidade das vulnerabilidades da IoT e ainda possíveis contramedidas relacionadas com a segurança e privacidade. Evolui-se para uma breve visão geral das plataformas de desenvolvimento de hardware da IoT. As fases seguintes consistem na identificação dos desafios e questões associadas à IoT, que foram restringidos às questões de segurança e privacidade. As demais etapas abordam o processo de pensamento de conceção (design thinking), design e implementação e, finalmente, a avaliação do desempenho. O IoT-HarPSecA é composto por três componentes principais: a Obtenção de Requisitos de Segurança (SRE), Orientações de Melhores Práticas de Segurança (SBPG) e a recomendação de Componentes de Algoritmos Criptográficos Leves (LWCAR) na implementação de software e hardware. O autor implementou uma ferramenta em linha de comandos usando linguagem C++ que serve como interface entre os utilizadores e a IoT-HarPSecA. Esta tese apresenta ainda uma descrição detalhada, desenho e implementação das componentes SRE, SBPG, e LWCAR. Apresenta ainda cenários práticos do mundo real que demostram como o IoT-HarPSecA pode ser utilizado para elicitar requisitos de segurança, gerar boas práticas de segurança (em termos de recomendações de implementação) e recomendar algoritmos criptográficos leves apropriados com base no contributo dos utilizadores. De igual forma, apresenta-se a avaliação do desempenho destes três componentes, demonstrando que o IoT-HarPSecA pode servir como um roteiro para o desenvolvimento seguro da IoT
    corecore