35,378 research outputs found

    Faults in Linux 2.6

    Get PDF
    In August 2011, Linux entered its third decade. Ten years before, Chou et al. published a study of faults found by applying a static analyzer to Linux versions 1.0 through 2.4.1. A major result of their work was that the drivers directory contained up to 7 times more of certain kinds of faults than other directories. This result inspired numerous efforts on improving the reliability of driver code. Today, Linux is used in a wider range of environments, provides a wider range of services, and has adopted a new development and release model. What has been the impact of these changes on code quality? To answer this question, we have transported Chou et al.'s experiments to all versions of Linux 2.6; released between 2003 and 2011. We find that Linux has more than doubled in size during this period, but the number of faults per line of code has been decreasing. Moreover, the fault rate of drivers is now below that of other directories, such as arch. These results can guide further development and research efforts for the decade to come. To allow updating these results as Linux evolves, we define our experimental protocol and make our checkers available

    Assessing Code Authorship: The Case of the Linux Kernel

    Get PDF
    Code authorship is a key information in large-scale open source systems. Among others, it allows maintainers to assess division of work and identify key collaborators. Interestingly, open-source communities lack guidelines on how to manage authorship. This could be mitigated by setting to build an empirical body of knowledge on how authorship-related measures evolve in successful open-source communities. Towards that direction, we perform a case study on the Linux kernel. Our results show that: (a) only a small portion of developers (26 %) makes significant contributions to the code base; (b) the distribution of the number of files per author is highly skewed --- a small group of top authors (3 %) is responsible for hundreds of files, while most authors (75 %) are responsible for at most 11 files; (c) most authors (62 %) have a specialist profile; (d) authors with a high number of co-authorship connections tend to collaborate with others with less connections.Comment: Accepted at 13th International Conference on Open Source Systems (OSS). 12 page

    Effort estimation of FLOSS projects: A study of the Linux kernel

    Get PDF
    This is the post-print version of the Article. The official published version can be accessed from the link below - Copyright @ 2011 SpringerEmpirical research on Free/Libre/Open Source Software (FLOSS) has shown that developers tend to cluster around two main roles: “core” contributors differ from “peripheral” developers in terms of a larger number of responsibilities and a higher productivity pattern. A further, cross-cutting characterization of developers could be achieved by associating developers with “time slots”, and different patterns of activity and effort could be associated to such slots. Such analysis, if replicated, could be used not only to compare different FLOSS communities, and to evaluate their stability and maturity, but also to determine within projects, how the effort is distributed in a given period, and to estimate future needs with respect to key points in the software life-cycle (e.g., major releases). This study analyses the activity patterns within the Linux kernel project, at first focusing on the overall distribution of effort and activity within weeks and days; then, dividing each day into three 8-hour time slots, and focusing on effort and activity around major releases. Such analyses have the objective of evaluating effort, productivity and types of activity globally and around major releases. They enable a comparison of these releases and patterns of effort and activities with traditional software products and processes, and in turn, the identification of company-driven projects (i.e., working mainly during office hours) among FLOSS endeavors. The results of this research show that, overall, the effort within the Linux kernel community is constant (albeit at different levels) throughout the week, signalling the need of updated estimation models, different from those used in traditional 9am–5pm, Monday to Friday commercial companies. It also becomes evident that the activity before a release is vastly different from after a release, and that the changes show an increase in code complexity in specific time slots (notably in the late night hours), which will later require additional maintenance efforts

    On Benchmarking Embedded Linux Flash File Systems

    Full text link
    Due to its attractive characteristics in terms of performance, weight and power consumption, NAND flash memory became the main non volatile memory (NVM) in embedded systems. Those NVMs also present some specific characteristics/constraints: good but asymmetric I/O performance, limited lifetime, write/erase granularity asymmetry, etc. Those peculiarities are either managed in hardware for flash disks (SSDs, SD cards, USB sticks, etc.) or in software for raw embedded flash chips. When managed in software, flash algorithms and structures are implemented in a specific flash file system (FFS). In this paper, we present a performance study of the most widely used FFSs in embedded Linux: JFFS2, UBIFS,and YAFFS. We show some very particular behaviors and large performance disparities for tested FFS operations such as mounting, copying, and searching file trees, compression, etc.Comment: Embed With Linux, Lorient : France (2012

    Empirical studies of open source evolution

    Get PDF
    Copyright @ 2008 Springer-VerlagThis chapter presents a sample of empirical studies of Open Source Software (OSS) evolution. According to these studies, the classical results from the studies of proprietary software evoltion, such as Lehman’s laws of software evolution, might need to be revised, if not fully, at least in part, to account for the OSS observations. The book chapter also summarises what appears to be the empirical status of each of Lehman’s laws with respect to OSS and highlights the threads to validity that frequently emerge in these empirical studies. The chapter also discusses related topics for further research

    Source File Set Search for Clone-and-Own Reuse Analysis

    Get PDF
    Clone-and-own approach is a natural way of source code reuse for software developers. To assess how known bugs and security vulnerabilities of a cloned component affect an application, developers and security analysts need to identify an original version of the component and understand how the cloned component is different from the original one. Although developers may record the original version information in a version control system and/or directory names, such information is often either unavailable or incomplete. In this research, we propose a code search method that takes as input a set of source files and extracts all the components including similar files from a software ecosystem (i.e., a collection of existing versions of software packages). Our method employs an efficient file similarity computation using b-bit minwise hashing technique. We use an aggregated file similarity for ranking components. To evaluate the effectiveness of this tool, we analyzed 75 cloned components in Firefox and Android source code. The tool took about two hours to report the original components from 10 million files in Debian GNU/Linux packages. Recall of the top-five components in the extracted lists is 0.907, while recall of a baseline using SHA-1 file hash is 0.773, according to the ground truth recorded in the source code repositories.Comment: 14th International Conference on Mining Software Repositorie
    • 

    corecore