9 research outputs found

    Evaluating static analysis defect warnings on production software

    Resource flow simulation model

    Integrating Defect Data, Code Review Data, and Version Control Data for Defect Analysis and Prediction

    In this thesis, we present a new approach to integrating software system defect data: defect reports, code reviews, and code commits. We propose to infer defect types from keywords. We index defect reports into groups by the keywords found in their descriptions, and study the properties of each group by leveraging code reviews and code commits. Our approach is more scalable than previous studies that consider defects classified by manual inspection, because indexing is automatic and can be applied uniformly to large defect datasets. Our approach can also analyze defects ranging from programming errors and performance issues to high-level design and user interface problems, a more comprehensive variety than previous studies using static program analysis. By applying our approach to Honeywell Automation and Control Solutions (ACS) projects, with roughly 700 defects, we found that some defect types could be five times more common than other defect types, which gave clues to the dominant root causes of the defects. We found certain defect types clustered in certain source files. We found that 20%-50% of the files usually contained more than 80% of the defects. Finally, we applied a known defect prediction algorithm to predict the hot files for the defect types of interest. We achieved a defect hit rate of 50%-90%.
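The keyword indexing described in this abstract can be pictured with a minimal sketch. The keyword table, the defect type names, and the sample reports below are all hypothetical; the abstract does not list the keywords or categories the thesis actually used.

```python
import re
from collections import defaultdict

# Hypothetical keyword-to-type table (an assumption, not taken from the thesis).
KEYWORDS = {
    "crash": "programming error",
    "null": "programming error",
    "slow": "performance",
    "timeout": "performance",
    "button": "user interface",
}

def index_reports(reports):
    """Group report ids under every defect type whose keyword occurs in the description."""
    groups = defaultdict(set)
    for report_id, description in reports.items():
        # Lowercase and split into alphabetic words so matching ignores case and punctuation.
        words = set(re.findall(r"[a-z]+", description.lower()))
        for keyword, defect_type in KEYWORDS.items():
            if keyword in words:
                groups[defect_type].add(report_id)
    return dict(groups)

# Made-up sample reports for illustration only.
reports = {
    1: "Crash when the config file is missing",
    2: "Report page is slow and hits a timeout",
    3: "Save button overlaps the toolbar; null label shown",
}
groups = index_reports(reports)
```

A report can land in more than one group (report 3 above matches two types), which is one plausible reading of indexing by keyword as opposed to assigning a single manual label.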

    How to Write System-specific, Static Checkers in Metal

    This paper gives an overview of the metal language, which we have designed to make it easy to construct system-specific, static analyses. We call these analyses extensions because they act as the input to a generic analysis engine that runs the static analysis over a given source base. We also interchangeably refer to them as checkers because they check that a user-specified property holds in the source base and report any violations of that property. Note that checkers may not detect all violations of a specified property. Their goal is to find as many violations as possible with a minimum of false positives.
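To illustrate what checking "a user-specified property" can look like, here is a small sketch of a checker for the property "a freed pointer is never used or freed again". It is written in Python, not in metal syntax, and the event format (a list of operation and variable pairs along one program path) is an assumption made for the example, not the input format of the analysis engine.

```python
def check_use_after_free(events):
    """Report violations of 'a freed pointer is never used or freed again'.

    events: list of (operation, variable) pairs in program order along one path.
    Returns a list of (event index, message) pairs.
    """
    freed = set()       # variables currently in the "freed" state
    violations = []
    for index, (op, var) in enumerate(events):
        if op == "free":
            if var in freed:
                violations.append((index, f"double free of {var}"))
            freed.add(var)
        elif op == "use" and var in freed:
            violations.append((index, f"use of {var} after free"))
        elif op == "assign":
            freed.discard(var)  # a fresh value makes the pointer valid again
    return violations

# Hypothetical path through a program, for illustration only.
path = [
    ("free", "p"), ("use", "q"), ("use", "p"),
    ("assign", "p"), ("use", "p"),
    ("free", "q"), ("free", "q"),
]
violations = check_use_after_free(path)
```

Because the sketch follows a single path and tracks variables by name only, it would miss violations through aliases, which matches the abstract's caveat that checkers may not detect every violation.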

    How to write system-specific, static checkers in metal

    Automatic detection of safety and security vulnerabilities in open source software

    Growing software quality requirements have raised the stakes on software safety and security. Building secure software focuses on techniques and methodologies of design and implementation in order to avoid exploitable vulnerabilities. Unfortunately, coding errors have become common with the inexorable growth of software size and complexity. According to the US National Institute of Standards and Technology (NIST), these coding errors lead to vulnerabilities that cost the US economy $60 billion each year. Therefore, tracking security and safety errors is considered a cornerstone of delivering software that is free from severe vulnerabilities. The main objective of this thesis is the elaboration of efficient, rigorous, and practical techniques for the safety and security evaluation of source code. To tackle safety errors related to the misuse of type and memory operations, we present a novel type and effect discipline that extends the standard C type system with safety annotations and static safety checks. We define an inter-procedural, flow-sensitive, and alias-sensitive inference algorithm that automatically propagates type annotations and applies safety checks to programs without programmers' interaction. Moreover, we present a dynamic semantics of our C core language that is compliant with the ANSI C standard. We prove the consistency of the static semantics with respect to the dynamic semantics. We show the soundness of our static analysis in detecting our targeted set of safety errors. To tackle system-specific security properties, we present a security verification framework that combines static analysis and model-checking. We base our approach on the GCC compiler and its GIMPLE representation of source code to extract model-checkable abstractions of programs. For the verification process, we use an off-the-shelf pushdown system model-checker, and turn it into a fully-fledged security verification framework. We also allow programmers to define a wide range of security properties using an automata-based specification approach. To demonstrate the efficiency and the scalability of our approach, we conduct extensive experiments and case studies on large-scale open-source software to verify their compliance with a representative set of the CERT standard secure coding rules.

    Programming Language Evolution and Source Code Rejuvenation

    Programmers rely on programming idioms, design patterns, and workaround techniques to express fundamental design not directly supported by the language. Evolving languages often address frequently encountered problems by adding language and library support to subsequent releases. By using new features, programmers can express their intent more directly. As new concerns, such as parallelism or security, arise, early idioms and language facilities can become serious liabilities. Modern code sometimes benefits from optimization techniques not feasible for code that uses less expressive constructs. Manual source code migration is expensive, time-consuming, and prone to errors. This dissertation discusses the introduction of new language features and libraries, exemplified by open-methods and a non-blocking growable array library. We describe the relationship of open-methods to various alternative implementation techniques. The benefits of open-methods materialize in simpler code, better performance, and similar memory footprint when compared to using alternative implementation techniques. Based on these findings, we develop the notion of source code rejuvenation, the automated migration of legacy code. Source code rejuvenation leverages enhanced program language and library facilities by finding and replacing coding patterns that can be expressed through higher-level software abstractions. Raising the level of abstraction improves code quality by lowering software entropy. In conjunction with extensions to programming languages, source code rejuvenation offers an evolutionary trajectory towards more reliable, more secure, and better performing code. We describe the tools that enable efficient implementations of code rejuvenations. The Pivot source-to-source translation infrastructure and its traversal mechanism form the core of our machinery. In order to free programmers from representation details, we use a light-weight pattern matching generator that turns a C like input language into pattern matching code. The generated code integrates seamlessly with the rest of the analysis framework. We utilize the framework to build analysis systems that find common workaround techniques for designated language extensions of C++0x (e.g., initializer lists). Moreover, we describe a novel system (TACE: template analysis and concept extraction) for the analysis of uninstantiated template code. Our tool automatically extracts requirements from the body of template functions. TACE helps programmers understand the requirements that their code de facto imposes on arguments and compare those de facto requirements to formal and informal specifications.