1,564 research outputs found

    Assisting Software Developers With License Compliance

    Open source licensing determines how open source systems are reused, distributed, and modified from a legal perspective. While it facilitates rapid development, the legal language of these licenses can make them difficult for developers to understand. Because of misunderstandings, systems can incorporate licensed code in a way that violates the terms of the license. Such incompatibilities between licenses can make it impossible to reuse a particular library without either relicensing the system or redesigning its architecture. Prior efforts have predominantly focused on license identification or on understanding the underlying phenomena, without reasoning about compatibility on a broad scale. The work in this dissertation first investigates the rationale of developers and identifies the areas in which developers struggle with free/open source software licensing. First, we investigate the diffusion of licenses and the prevalence of license changes in a large-scale empirical study of 16,221 Java systems. We observed a clear lack of traceability and a lack of standardized licensing that led to difficulties and confusion for developers trying to reuse source code. We further investigated the difficulty by surveying the developers of the systems with license changes to understand why they first adopted a license and then changed licenses. Additionally, we performed an analysis on issue trackers and legal mailing lists to extract licensing bugs. From these works, we identified key areas in which developers struggled and needed support. While developers need support to identify license incompatibilities and understand both the cause and implications of the incompatibilities, we observed that state-of-the-art license identification tools did not identify license exceptions. 
Since these exceptions directly modify the license terms (either the permissions granted by the license or the restrictions imposed by the license), we proposed an approach to complement current license identification techniques in order to classify license exceptions. The approach relies on supervised machine learners to classify the licensing text and identify the particular license exceptions or the lack of a license exception. Subsequently, we built an infrastructure to assist developers with evaluating license compliance warnings for their system. The infrastructure evaluates compliance across the dependency tree of a system to ensure it is compliant with all of the licenses of the dependencies. When an incompatibility is present, it notes the specific library/libraries and the conflicting license(s) so that the developers can investigate the compliance warnings in their system, since unresolved incompatibilities would prevent distribution of their software. We conduct a study on 121,094 open source projects spanning 6 programming languages, and we demonstrate that the infrastructure is able to identify license incompatibilities between these projects and their dependencies
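
The dependency-tree check described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's infrastructure: the `COMPATIBLE` table, the `find_conflicts` function, and the library names are hypothetical, and the compatibility table is a deliberately tiny assumption, not legal guidance.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical, deliberately tiny table: for each project license, the set of
# dependency licenses this sketch assumes to be acceptable. Not legal advice.
COMPATIBLE: Dict[str, Set[str]] = {
    "MIT": {"MIT", "BSD-3-Clause", "Apache-2.0"},
    "Apache-2.0": {"MIT", "BSD-3-Clause", "Apache-2.0"},
    "GPL-3.0-only": {"MIT", "BSD-3-Clause", "Apache-2.0", "GPL-3.0-only"},
}


def find_conflicts(
    project: str,
    project_license: str,
    deps: Dict[str, List[str]],
    licenses: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Walk the whole dependency tree of `project` and return the
    (library, license) pairs that the table does not allow."""
    allowed = COMPATIBLE.get(project_license, set())
    conflicts: List[Tuple[str, str]] = []
    seen: Set[str] = set()
    stack = list(deps.get(project, []))
    while stack:
        lib = stack.pop()
        if lib in seen:  # shared or cyclic dependencies are visited once
            continue
        seen.add(lib)
        lib_license = licenses.get(lib, "UNKNOWN")
        if lib_license not in allowed:
            conflicts.append((lib, lib_license))
        stack.extend(deps.get(lib, []))  # descend into transitive dependencies
    return sorted(conflicts)


# Example: a transitive dependency carries a license the table rejects.
example_deps = {"app": ["libA", "libB"], "libA": ["libC"], "libB": [], "libC": []}
example_licenses = {"libA": "MIT", "libB": "Apache-2.0", "libC": "GPL-3.0-only"}
print(find_conflicts("app", "Apache-2.0", example_deps, example_licenses))
```

Reporting the offending library together with its license mirrors the abstract's point that developers need to know which dependency triggered a warning. A library with no recorded license is treated as `UNKNOWN` and flagged.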

    On the Detection of Licenses Violations in the Android Ecosystem

    Mobile application (app) developers often reuse code from existing libraries and frameworks in order to reduce development costs. However, these libraries and frameworks are governed by licenses with which developers must comply. A license governs the way in which a library or piece of code can be reused, modified, or redistributed. It can be seen as a list of rules that developers must respect before using the component. Failure to comply with a license is likely to result in penalties and fines. 
In this thesis, we propose an approach for license identification in open source applications. By applying this approach, we conduct a case study to identify licenses in 857 mobile apps from the F-Droid market, with the aim of understanding which types of licenses are most used by developers and how these licenses evolve over time. We conduct our study at both the project level and the file level. We also investigate license violations and the evolution of these violations over time; we compare licenses declared at the project level, at the file level, and by the libraries used by a project to look for incompatible licenses used in the same project. Results show that the most used licenses are the GPL and Apache licenses, at both the project level and the file level. In many cases we noticed that developers did not pay enough attention to licensing their source code. For 3,250 of 8,938 app releases, the apps were distributed without license information. Regarding license evolution, we noticed that the probability of a project staying under the same license is very high (95% on average), and when a change occurs, it is generally toward more permissive licenses. At the file level, we noticed that developers tend to delay their decision about license selection; in addition, in 15% of license changes, developers removed license information. We identified 15 projects out of 857 with a license violation; 7 projects still had violations in their final release. To resolve license violations, developers either changed the license of some of the app's files or removed the contentious files from the app. It took on average 19 releases to resolve a license violation. These findings suggest that developers of mobile apps may have difficulty understanding the legal constraints of license terms, or that the lack of consistency and standardization in license declarations fosters confusion among developers. 
Our license detection approach can be used by developers to track license violations in their projects
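
The comparison of project-level and file-level declarations described above can be sketched as follows. This is not the thesis's detection method: it assumes, purely for illustration, that files declare their license with an `SPDX-License-Identifier:` tag, and the names `file_license` and `flag_files` are hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumption for this sketch: files carry an SPDX tag near the top.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")


def file_license(source_text: str) -> Optional[str]:
    """Return the license identifier declared in the file header, if any."""
    match = SPDX_TAG.search(source_text[:2000])  # only inspect the header region
    return match.group(1) if match else None


def flag_files(
    project_license: str, files: Dict[str, str]
) -> List[Tuple[str, Optional[str]]]:
    """List files whose declared license is missing or differs from the
    project-level license, so that a person can review them."""
    flagged: List[Tuple[str, Optional[str]]] = []
    for path in sorted(files):
        declared = file_license(files[path])
        if declared != project_license:
            flagged.append((path, declared))
    return flagged
```

A differing license is not necessarily a violation. The sketch only narrows down which files deserve a compatibility review, and files with no declaration show up with `None`.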

    Using the Uniqueness of Global Identifiers to Determine the Provenance of Python Software Source Code

    We consider the problem of identifying the provenance of free/open source software (FOSS), and specifically the need to identify where reused source code has been copied from. We propose a lightweight approach to solve the problem based on software identifiers, such as the names of variables, classes, and functions chosen by programmers. The proposed approach is able to efficiently narrow down to a small set of candidate origin products, to be further analyzed with more expensive techniques to make a final provenance determination. By analyzing the PyPI (Python Package Index) open source ecosystem we find that globally defined identifiers are very distinct. Across PyPI's 244 K packages we found 11.2 M different global identifiers (class names and method/function names, with only 0.6% of identifiers shared between the two types of entities); 76% of identifiers were used in only one package, and 93% in at most 3. Randomly selecting 3 non-frequent global identifiers from an input product is enough to narrow down its origins to a maximum of 3 products in 89% of the cases. We validate the proposed approach by mapping Debian source packages implemented in Python to the corresponding PyPI packages; this approach uses at most five trials, where each trial uses three randomly chosen global identifiers from a randomly chosen Python file of the subject software package, then ranks results using a popularity index and requires inspecting only the top result. In our experiments, this method is effective at finding the true origin of a project with a recall of 0.9 and precision of 0.77
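
The identifier-based narrowing described above can be sketched with Python's standard `ast` module. This is not the authors' implementation: the function names, the `max_freq` threshold, and the vote-counting ranking are assumptions made for illustration, and the popularity index mentioned in the abstract is not modeled.

```python
import ast
import random
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Set


def global_identifiers(source: str) -> Set[str]:
    """Collect the names of classes and functions/methods defined in a file."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            names.add(node.name)
    return names


def build_index(packages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each identifier to the set of packages that define it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for package, source in packages.items():
        for name in global_identifiers(source):
            index[name].add(package)
    return dict(index)


def candidate_origins(
    source: str,
    index: Dict[str, Set[str]],
    k: int = 3,
    max_freq: int = 3,
    seed: Optional[int] = None,
) -> List[str]:
    """Sample up to k non-frequent identifiers and rank the packages that
    define them by how many of the sampled identifiers they share."""
    rng = random.Random(seed)
    # "Non-frequent" is assumed here to mean: defined in at most max_freq packages.
    rare = sorted(
        name
        for name in global_identifiers(source)
        if 0 < len(index.get(name, ())) <= max_freq
    )
    sample = rng.sample(rare, min(k, len(rare)))
    votes: Counter = Counter()
    for name in sample:
        votes.update(index[name])
    return sorted(votes, key=lambda pkg: (-votes[pkg], pkg))
```

Calling `candidate_origins(subject_source, build_index(known_packages))` returns a short ranked list of candidate packages, which a more expensive comparison could then confirm or reject.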

    Commonsense Solutions: How State Laws Can Reduce Gun Deaths Associated with Mental Illness

    Guns in the hands of the dangerously mentally ill have taken the lives of too many people. Mass shootings, like the shooting in a parking lot in Tucson, Arizona in January 2011, and the shooting in a movie theater in Aurora, Colorado in July 2012, have brought this problem to the attention of the American public. Experts have objected to the media's emphasis on mentally ill mass shooters, because mental illness is not the cause of most forms of gun violence toward others. Nevertheless, mental illness certainly plays a role in this violence, as the recent surge in mass shootings demonstrates. In fact, mental illness plays an even greater role in gun suicides, many of which could be averted if guns were temporarily removed from the situation. Existing state laws do not do enough to remove access to guns from dangerously mentally ill people. This report provides a series of proposals that state legislators should consider to address this problem and save lives

    A Guide to Distributed Digital Preservation

    "This volume is devoted to the broad topic of distributed digital preservation, a still-emerging field of practice for the cultural memory arena. Replication and distribution hold out the promise of indefinite preservation of materials without degradation, but establishing effective organizational and technical processes to enable this form of digital preservation is daunting. Institutions need practical examples of how this task can be accomplished in manageable, low-cost ways."--P. [4] of cover

    Creative commons for educators and librarians

    Contents: What is Creative Commons? -- Copyright Law -- Anatomy of a Creative Commons License -- Using Creative Commons Licenses and Creative Commons-Licensed Works -- Creative Commons for Librarians and Educators. Abstract: "The authoritative source for learning about using creative commons licenses and advocating for their use in your academic community"-- Provided by publisher

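    Webová aplikace pro analýzu a vizualizaci legislativního procesu postavená na principech Linked Data (Web Application for the Analysis and Visualization of the Legislative Process, Built on Linked Data Principles)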

    The aim of this thesis is to create a new dataset containing bills being passed in the Czech Parliament. It creates an ontology describing the legislative process and the individual stages of passing a bill. Both the dataset and the ontology use RDF to semanticize public exports provided by the Czech Chamber of Deputies. To create the ontology, the thesis attempts to specialize existing ontologies such as FRBR to describe data with a narrower domain. Resources representing bills are linked to other ontologies to connect the dataset to the Linked Data cloud. For practical presentation of the created dataset, new visualization plugins are programmed into Payola, an open-source Linked Data management tool, using HTML5 technologies. Department of Software Engineering, Faculty of Mathematics and Physics

    Assessment and Evaluation of the E-ARK Pilots and Tools

    The E-ARK Project focuses on harmonizing currently fragmented solutions that support Archives services, especially in regard to Ingest, Archival Preservation and Dissemination of information. E-ARK solutions were tested in a series of open pilots in various national contexts, using both existing and near-to-market tools, as well as services developed by partners. This report provides the final assessment and evaluation of the pilots. Moreover, this report also details the technical evaluation of the tools developed within the project