21 research outputs found

    Assisting Software Developers With License Compliance

    Open source licensing determines how open source systems are reused, distributed, and modified from a legal perspective. While it facilitates rapid development, the legal language of these licenses can make them difficult for developers to understand. Because of misunderstandings, systems can incorporate licensed code in a way that violates the terms of the license. Such incompatibilities between licenses can result in the inability to reuse a particular library without either relicensing the system or redesigning the architecture of the system. Prior efforts have predominantly focused on license identification or understanding the underlying phenomena without reasoning about compatibility on a broad scale. The work in this dissertation investigates the rationale of developers and identifies the areas in which developers struggle with free/open source software licensing. First, we investigate the diffusion of licenses and the prevalence of license changes in a large-scale empirical study of 16,221 Java systems. We observed a clear lack of traceability and a lack of standardized licensing that led to difficulties and confusion for developers trying to reuse source code. We further investigated the difficulty by surveying the developers of the systems with license changes to understand why they first adopted a license and then changed licenses. Additionally, we performed an analysis of issue trackers and legal mailing lists to extract licensing bugs. From these works, we identified key areas in which developers struggled and needed support. While developers need support to identify license incompatibilities and understand both the cause and implications of the incompatibilities, we observed that state-of-the-art license identification tools did not identify license exceptions. Since these exceptions directly modify the license terms (either the permissions granted by the license or the restrictions imposed by the license), we proposed an approach to complement current license identification techniques in order to classify license exceptions. The approach relies on supervised machine learners to classify the licensing text and identify the particular license exception or the lack of a license exception. Subsequently, we built an infrastructure to assist developers with evaluating license compliance warnings for their system. The infrastructure evaluates compliance across the dependency tree of a system to ensure it is compliant with all of the licenses of the dependencies. When an incompatibility is present, it notes the specific library/libraries and the conflicting license(s) so that the developers can investigate these compliance warnings, which would otherwise prevent distribution of their software. We conduct a study on 121,094 open source projects spanning 6 programming languages, and we demonstrate that the infrastructure is able to identify license incompatibilities between these projects and their dependencies.
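
    The dependency-tree compliance check described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the dissertation: the package names, the `PACKAGES` graph, the `INCOMPATIBLE` rule table (a deliberately simplified toy, not legal guidance), and the `find_conflicts` helper are all hypothetical.

```python
# Hypothetical dependency graph: package name -> (license, direct dependencies).
PACKAGES = {
    "my-app": ("Apache-2.0", ["http-lib", "report-lib"]),
    "http-lib": ("MIT", []),
    "report-lib": ("MIT", ["chart-lib"]),
    "chart-lib": ("GPL-3.0-only", []),
}

# Toy rule table (assumption): (dependency license, project license) pairs
# that this sketch treats as incompatible. Real compatibility rules are more
# nuanced and depend on how the code is combined and distributed.
INCOMPATIBLE = {
    ("GPL-3.0-only", "Apache-2.0"),
    ("GPL-3.0-only", "MIT"),
}


def find_conflicts(root, packages, incompatible):
    """Walk the dependency tree of `root` and report each dependency whose
    license conflicts with the root's license, with the path that reaches it."""
    root_license = packages[root][0]
    conflicts = []
    seen = {root}
    # Depth-first traversal; each stack entry carries the path from the root.
    stack = [(dep, [root, dep]) for dep in packages[root][1]]
    while stack:
        name, path = stack.pop()
        if name in seen:
            continue  # visit each package once, even if shared or cyclic
        seen.add(name)
        license_id, deps = packages[name]
        if (license_id, root_license) in incompatible:
            conflicts.append((" -> ".join(path), license_id))
        stack.extend((dep, path + [dep]) for dep in deps)
    return sorted(conflicts)


if __name__ == "__main__":
    for path, license_id in find_conflicts("my-app", PACKAGES, INCOMPATIBLE):
        print(f"warning: {path} is licensed under {license_id}")
```

    In this sketch the transitive dependency `chart-lib` is flagged even though `my-app` never declares it directly, which is the kind of warning a tree-wide check is meant to surface.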

    Open Source Law, Policy and Practice

    This book examines various policies, including the legal and commercial aspects of the Open Source phenomenon. Here, ‘Open Source’ is adopted as convenient shorthand for a collection of diverse users and communities, whose differences can be as great as their similarities. The common thread is their reliance on, and use of, law and legal mechanisms to govern the source code they write, use, and distribute. The central fact of open source is that maintaining control over source code relies on the existence and efficacy of intellectual property (‘IP’) laws, particularly copyright law. Copyright law is the primary statutory tool that achieves the end of openness, although implemented through private law arrangements at varying points within the software supply chain. This dependent relationship is itself a cause of concern for some philosophically in favour of ‘open’, with some predicting (or hoping) that the free software movement will bring about the end of copyright as a means for protecting software.

    An Empirical Study of AI-based Smart Contract Creation

    The introduction of large language models (LLMs) like ChatGPT and Google PaLM 2 for smart contract generation seems to be the first well-established instance of an AI pair programmer. LLMs have access to a large number of open-source smart contracts, enabling them to utilize more extensive code in Solidity than other code generation tools. Although the initial and informal assessments of LLMs for smart contract generation are promising, a systematic evaluation is needed to explore the limits and benefits of these models. The main objective of this study is to assess the quality of the code that LLMs generate for smart contracts. We also aim to evaluate the impact of the quality and variety of input parameters fed to LLMs. To achieve this aim, we created an experimental setup for evaluating the generated code in terms of validity, correctness, and efficiency. Our study finds crucial evidence that security bugs are introduced into the generated smart contracts and that the overall quality and correctness of the code are affected. However, we also identified the areas where it can be improved. The paper also proposes several potential research directions to improve the process, quality, and safety of generated smart contract code. Comment: Updated to address issue

    COOPERATIVE SIGNALING BEHAVIOR: SIGNALS FOR OPEN SOURCE PROJECT HEALTH

    The core contribution is a critique of signaling theory from investigating cooperative signaling behavior in the context of organizational engagement with open source projects. Open source projects display signals of project health which are used by organizations. Projects and organizations engage in cooperative signaling behavior when they work together to create signals. Signaling theory is critiqued in the cooperative context of organizational engagements with open source projects by describing how cooperative signaling behavior occurs in three processes: identifying, evaluating, and filtering new signals. The contribution is informed through engaged field research and interviews, which are presented as a thick description of the CHAOSS Diversity & Inclusion Working Group and of how its community members create D&I signals. A contribution to the literature on open source is a description of how signals for open source project health are created.

    Oswaldo: A Semantic Web Enabled Approach for Identifying Open Source License Violations

    Open source license violations are numerous, multifaceted, and pose significant risk to developers and companies in the form of litigation, sometimes resulting in millions of dollars in damages or settlements. Free/Libre and Open Source Licenses utilize copyright law and are written in legalese, which is often outside the scope of a developer’s expertise. Software engineers commit violations of these licenses’ terms and conditions easily and often unknowingly. Consequently, increased knowledge, better tools, and sound processes to detect and prevent license violations are extremely important. This work is an investigation into the types of potential license violations that are committed through direct and transitive dependency hierarchies in hundreds of thousands of real-world software projects. This thesis contributes a novel approach, entitled Oswaldo, that defines and detects three types of license conflicts: Type 1 Simple Violations, Type 2 Embedded Violations, and Type 3 Compound Violations. Unidirectional compatibility/incompatibility relationships of major licenses are modelled. Ontologies and Linked Data are exploited to detect the transitive violation Types 2 and 3, as well as the direct violation Type 1. This thesis also reports initial evaluations of these three types of license violations found in the Maven repository.

    Towards Understanding Third-party Library Dependency in C/C++ Ecosystem

    Third-party libraries (TPLs) are frequently reused in software to reduce development cost and the time to market. However, external library dependencies may introduce vulnerabilities into host applications. The issue of library dependency has received considerable critical attention. Many package managers, such as Maven, Pip, and NPM, have been proposed to manage TPLs. Moreover, a significant amount of effort has been put into studying dependencies in language ecosystems like Java, Python, and JavaScript, but not C/C++. Due to the lack of a unified package manager for C/C++, existing research has only a limited understanding of TPL dependencies in the C/C++ ecosystem, especially at large scale. Towards understanding TPL dependencies in the C/C++ ecosystem, we collect existing TPL databases, package management tools, and dependency detection tools, summarize the dependency patterns of C/C++ projects, and construct a comprehensive and precise C/C++ dependency detector. Using our detector, we extract dependencies from a large-scale database containing 24K C/C++ repositories from GitHub. Based on the extracted dependencies, we provide the results and findings of an empirical study, which aims at understanding the characteristics of the TPL dependencies. We further discuss the implications for managing dependencies in C/C++ and the future research directions for software engineering researchers and developers in the fields of library development, software composition analysis, and C/C++ package managers. Comment: ASE 202

    Isabelle/DOF. User and Implementation Manual

    The software for which this is the manual is available via the DOI in this record. Isabelle/DOF provides an implementation of DOF on top of Isabelle/HOL. DOF itself is a novel framework for defining ontologies and enforcing them during document development and document evolution. Isabelle/DOF targets use-cases such as mathematical texts referring to a theory development or technical reports requiring a particular structure. A major application of DOF is the integrated development of formal certification documents (e.g., for Common Criteria or CENELEC 50128) that require consistency across both formal and informal arguments. Isabelle/DOF is integrated into Isabelle’s IDE, which allows for smooth ontology development as well as immediate ontological feedback during the editing of a document. Its checking facilities leverage the collaborative development of documents required to be consistent with an underlying ontological structure. In this user manual, we give an in-depth presentation of the design concepts of DOF’s Ontology Definition Language (ODL) and describe comprehensively its major commands. Many examples show typical best-practice applications of the system. Isabelle/DOF is the first ontology language supporting machine-checked links between the formal and informal parts in an LCF-style interactive theorem proving environment. IRT System

    Avoimen lähdekoodin lisenssien ominaisuuksien vaikutukset ohjelmistoarkkitehtuuriin

    Open source licenses enable software developers to co-operate with unknown developers to modify and redistribute software without direct financial costs to themselves. Detecting the actual licenses and copyright holders of open source components can be difficult, and open source licenses can conflict with each other and can make profiting from open source difficult. Current license compliance methods do not take into account all open source license properties. Some developers are afraid to use open source because they do not understand open source license properties or license management methods. In the OSSLI project, the current understanding of the different effects of open source license properties on software engineering was gathered by a systematic literature review. This thesis uses the results of the literature review, ontologies, and general system theory to construct a framework to show how the properties of open source licenses affect software architecture. This OSSLI framework consists of the abstract legal system, procedural legal system, software architecture system, software engineering system, business system, and social system. This thesis uses the OSSLI framework to evaluate current methods and tools to manage open source license issues and shows how the OSSLI framework was used in the research project to design a new tool to manage open source license compliance through software architecture. The OSSLI framework showed its utility in understanding the effects of open source license properties.
    Open source licenses allow software developers to further develop and distribute software in cooperation with developers unknown to them, without paying separate monetary compensation. However, open source licenses can be hard to understand and can hinder commercial exploitation of software, and the properties of different licenses can conflict with each other. Current license management methods do not take into account all properties of open source licenses, and verifying the actual copyright of components can be difficult. Not all software developers dare to use open source, because they do not understand the properties of open source licenses or the methods for managing them. In the OSSLI research project, a systematic literature review was used to gather information on the current scientific understanding of the effects of open source licenses on software engineering. This master's thesis uses the findings of the literature review, ontologies, and general systems theory to build a framework for outlining the effects of open source license properties on software architecture. The OSSLI framework is built from abstract and applied law, software architecture, software development, business, and the social network, and it also takes license properties into account. The thesis uses the OSSLI framework to evaluate the tools and methods used to manage open source license risks and describes how the framework was used in the research project to develop a new architecture-level license management tool. The OSSLI framework proved useful for understanding the effects of open source license properties.