322 research outputs found

    Empirical Notes on the Interaction Between Continuous Kernel Fuzzing and Development

    Fuzzing has been studied and applied ever since the 1990s. Automated and continuous fuzzing has recently been applied also to open source software projects, including the Linux and BSD kernels. This paper concentrates on the practical aspects of continuous kernel fuzzing in four open source kernels. According to the results, there are over 800 unresolved crashes reported for the four kernels by the syzkaller/syzbot framework. Many of these were reported relatively long ago. Interestingly, fuzzing-induced bugs have been resolved in the BSD kernels more rapidly. Furthermore, assertions and debug checks, use-after-frees, and general protection faults account for the majority of bug types in the Linux kernel. About 23% of the fixed bugs in the Linux kernel have either gone through code review or additional testing. Finally, only code churn provides a weak statistical signal for explaining the associated bug fixing times in the Linux kernel. Comment: The 4th IEEE International Workshop on Reliability and Security Data Analysis (RSDA), 2019 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Berlin, IEE

    Got Issues? Who Cares About It? A Large Scale Investigation of Issue Trackers from GitHub

    Feedback from software users constitutes a vital part in the evolution of software projects. By filing issue reports, users help identify and fix bugs, document software code, and enhance the software via feature requests. Many studies have explored issue reports, proposed approaches to enable the submission of higher-quality reports, and presented techniques to sort, categorize and leverage issues for software engineering needs. Who, however, cares about filing issues? What kinds of issues are reported in issue trackers? What kind of correlation exists between issue reporting and the success of software projects? In this study, we address the need for answering such questions by performing an empirical study on a hundred thousand open source projects. After filtering relevant trackers, the study used about 20,000 projects. We investigate and answer various research questions on the popularity and impact of issue trackers

    DANIEL: Towards Automated Bug Discovery By Black Box Test Case Generation & Recommendation

    Finding and documenting bugs in software systems is an essential component of the software development process. A bug is defined as a series of steps that produces behavior which differs from the software specification and requirements. Finding steps to produce such behavior requires expert knowledge of the possible operations of the software in development as well as intuition and creativity. This thesis proposes the Directed Action Node Input Execution Language (DANIEL), a language that represents test cases as directed graphs, where each node represents an action, and possible input arguments for each action are represented along the incoming directed edges. With this representation, it is possible to form a union of all recorded test cases, making a combined directed graph which represents all of the paths of interaction with the developing software. This thesis demonstrates how DANIEL can generate prioritized test cases for a web form application, while also preserving workflow context. Using a graph built on Selenium test cases, we evaluate a random walk, a weighted walk, and model-weighted walks integrating logistic regression and XGBoost to compute the relevant probabilities. We find that the weighted walk discovers the most bugs while the model-weighted walk provides the most meaningful coverage
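
The graph representation described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `START` sentinel, the function names, and the choice to weight each edge by how often it was observed in recorded test cases are all assumptions made for illustration, and the model-weighted walks (logistic regression, XGBoost) are not covered.

```python
import random
from collections import defaultdict

START = "<start>"  # hypothetical sentinel node marking the beginning of every test case


def build_graph(test_cases):
    """Union recorded test cases into one directed graph.

    Each test case is a list of (action, input_argument) steps. The graph maps
    source action -> {(target action, input argument): times observed}, so the
    input for an action sits on its incoming edge.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for case in test_cases:
        prev = START
        for action, arg in case:
            graph[prev][(action, arg)] += 1
            prev = action
    return graph


def walk(graph, max_steps, weighted=True, rng=None):
    """Generate one test case by walking the graph from START.

    weighted=True picks edges in proportion to their observed count (an
    assumed weighting); weighted=False is a uniform random walk.
    """
    rng = rng or random.Random()
    node, path = START, []
    for _ in range(max_steps):
        edges = graph.get(node)
        if not edges:  # no outgoing edges: the walk ends here
            break
        choices = list(edges)
        weights = [edges[c] for c in choices] if weighted else None
        action, arg = rng.choices(choices, weights=weights, k=1)[0]
        path.append((action, arg))
        node = action
    return path
```

Calling `walk(build_graph(recorded_cases), max_steps=20)` yields a new sequence of steps whose every transition appeared in at least one recorded case, which is one way to read the abstract's point about preserving workflow context.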

    When would this bug get reported?

    Not all bugs in software are experienced and reported by end users right away: some bugs manifest themselves quickly and may be reported by users a few days after they get into the code base; others manifest many months or even years later, and may only be experienced and reported by a small number of users. We refer to the period of time between the time when a bug is introduced into code and the time when it is reported by a user as bug reporting latency. Knowledge of bug reporting latencies has implications for the prioritization of bug fixing activities: bugs with low reporting latencies may be fixed earlier than those with high latencies, shifting debugging resources towards the bugs that most concern users. To investigate bug reporting latencies, we analyze bugs from three Java software systems: AspectJ, Rhino, and Lucene. We extract bug reporting data from their version control repositories and bug tracking systems, identify bug locations based on bug fixes, and back-trace bug introducing time based on change histories of the buggy code. Also, we remove nonessential changes, and most importantly, recover root causes of bugs from their treatments/fixes. We then calculate the bug reporting latencies, and find that bugs have diverse reporting latencies. Based on the calculated reporting latencies and features we extract from bugs, we build classification models that can predict whether a bug would be reported early (within 30 days) or later, which may be helpful for prioritizing bug fixing activities. Our evaluation on the three software systems shows that our bug reporting latency prediction models could achieve an AUC (Area Under the Receiver Operating Characteristic Curve) of 70.869%.
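
The definition of bug reporting latency and the early/late label can be written down as a small Python sketch. The function names are hypothetical, and treating "within 30 days" as an inclusive bound is an assumption; the paper's feature extraction and classification models are not reproduced here.

```python
from datetime import date

EARLY_THRESHOLD_DAYS = 30  # "early" cutoff taken from the abstract


def reporting_latency_days(introduced: date, reported: date) -> int:
    """Days between a bug entering the code base and its first user report."""
    return (reported - introduced).days


def is_reported_early(introduced: date, reported: date) -> bool:
    """Label used as the prediction target: early if reported within 30 days.

    The inclusive comparison (<=) is an assumption about the cutoff.
    """
    return reporting_latency_days(introduced, reported) <= EARLY_THRESHOLD_DAYS
```

For a bug introduced on 1 March and reported on 15 March of the same year, the latency is 14 days and the bug would be labelled early.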

    The Software Vulnerability Ecosystem: Software Development In The Context Of Adversarial Behavior

    Software vulnerabilities are the root cause of many computer system security failures. This dissertation addresses software vulnerabilities in the context of a software lifecycle, with a particular focus on three stages: (1) improving software quality during development; (2) pre-release bug discovery and repair; and (3) revising software as vulnerabilities are found. The question I pose regarding software quality during development is whether long-standing software engineering principles and practices such as code reuse help or hurt with respect to vulnerabilities. Using a novel data-driven analysis of large databases of vulnerabilities, I show the surprising result that software quality and software security are distinct. Most notably, the analysis uncovered a counterintuitive phenomenon, namely that newly introduced software enjoys a period with no vulnerability discoveries, and further that this “Honeymoon Effect” (a term I coined) is well-explained by the unfamiliarity of the code to malicious actors. An important consequence for code reuse, intended to raise software quality, is that protections inherent in delays in vulnerability discovery from new code are reduced. The second question I pose concerns the predictive power of this effect. My experimental design exploited a large-scale open source software system, Mozilla Firefox, in which two development methodologies are pursued in parallel, making that the sole variable in outcomes. A comparison of the methodologies, using a novel synthesis of data from vulnerability databases, suggests that the rapid-release cycles used in agile software development (in which new software is introduced frequently) have a vulnerability discovery rate equivalent to conventional development. Finally, I pose the question of the relationship between the intrinsic security of software, stemming from design and development, and the ecosystem into which the software is embedded and in which it operates. I use the early development lifecycle to examine this question, and again use vulnerability data as the means of answering it. Defect discovery rates should decrease in a purely intrinsic model, with software maturity making vulnerabilities increasingly rare. The data, which show that vulnerability rates increase after a delay, contradict this. Software security therefore must be modeled including extrinsic factors, thus comprising an ecosystem

    Supporting Development Decisions with Software Analytics

    Software practitioners make technical and business decisions based on the understanding they have of their software systems. This understanding is grounded in their own experiences, but can be augmented by studying various kinds of development artifacts, including source code, bug reports, version control meta-data, test cases, usage logs, etc. Unfortunately, the information contained in these artifacts is typically not organized in a way that is immediately useful to developers’ everyday decision making needs. To handle the large volumes of data, many practitioners and researchers have turned to analytics, that is, the use of analysis, data, and systematic reasoning for making decisions. The thesis of this dissertation is that by employing software analytics to various development tasks and activities, we can provide software practitioners better insights into their processes, systems, products, and users, to help them make more informed data-driven decisions. While quantitative analytics can help project managers understand the big picture of their systems, plan for their future, and monitor trends, qualitative analytics can enable developers to perform their daily tasks and activities more quickly by helping them better manage high volumes of information. To support this thesis, we provide three different examples of employing software analytics. First, we show how analysis of real-world usage data can be used to assess user dynamic behaviour and adoption trends of a software system by revealing valuable information on how software systems are used in practice. Second, we have created a lifecycle model that synthesizes knowledge from software development artifacts, such as reported issues, source code, discussions, community contributions, etc. Lifecycle models capture the dynamic nature of how various development artifacts change over time in an annotated graphical form that can be easily understood and communicated. We demonstrate how lifecycle models can be generated and present industrial case studies where we apply these models to assess the code review process of three different projects. Third, we present a developer-centric approach to issue tracking that aims to reduce information overload and improve developers’ situational awareness. Our approach is motivated by a grounded theory study of developer interviews, which suggests that customized views of a project’s repositories that are tailored to developer-specific tasks can help developers better track their progress and understand the surrounding technical context of their working environments. We have created a model of the kinds of information elements that developers feel are essential in completing their daily tasks, and from this model we have developed a prototype tool organized around developer-specific customized dashboards. The results of these three studies show that software analytics can inform evidence-based decisions related to user adoption of a software project, code review processes, and improved developers’ awareness of their daily tasks and activities

    Determinants of Success of the Open Source Selective Revealing Strategy: Solution Knowledge Emergence

    Recent research suggests that firms may be able to create a competitive advantage by deliberately revealing specific problem knowledge beyond firm boundaries to open source meta-organisations such that new solution knowledge is created that benefits the focal firm more than its competitors (Alexy, George, & Salter, 2013). Yet, not all firms that use knowledge revealing strategies are successful in inducing the emergence of solution knowledge. The extant literature has not yet explained this heterogeneity in the success of knowledge revealing strategies. Using a longitudinal database spanning the period from 1998 to the end of 2012 with more than 2 billion data points that was obtained from the Mozilla Foundation, one of the top open source meta-organisations, this dissertation identifies and measures the antecedent factors affecting successful solution knowledge emergence. The results reveal 35 antecedent factors that affect solution knowledge emergence in different ways across three levels of analysis. The numerous contributions to theory and practice that follow from the results are discussed

    Survival in the e-conomy: 2nd Australian information warfare & security conference 2001

    This is an international conference for academics and industry specialists in information warfare, security, and other related fields. The conference has drawn participants from national and international organisations