59 research outputs found

    Data Mining for Software Engineering

    Strengthening the Synapse between Outpatient Neurological Care and Inpatient Referral

    Objective: The goal of this project is to investigate whether established neurology patients are appropriately referred to the emergency room. We suspect that some patients could be triaged more effectively to prevent unnecessary visits to the emergency department. If so, implementing an intervention such as offering expedited visits or a contingency plan may reduce non-emergent inpatient consultative services [5]. This would also improve outpatient communication and decrease utilization of both ER and patient resources.

    Harvey: A Greybox Fuzzer for Smart Contracts

    We present Harvey, an industrial greybox fuzzer for smart contracts, which are programs managing accounts on a blockchain. Greybox fuzzing is a lightweight test-generation approach that effectively detects bugs and security vulnerabilities. However, greybox fuzzers randomly mutate program inputs to exercise new paths; this makes it challenging to cover code that is guarded by narrow checks, which are satisfied by no more than a few input values. Moreover, most real-world smart contracts transition through many different states during their lifetime, e.g., for every bid in an auction. To explore these states and thereby detect deep vulnerabilities, a greybox fuzzer would need to generate sequences of contract transactions, e.g., by creating bids from multiple users, while at the same time keeping the search space and test suite tractable. In this experience paper, we explain how Harvey alleviates both challenges with two key fuzzing techniques and distill the main lessons learned. First, Harvey extends standard greybox fuzzing with a method for predicting new inputs that are more likely to cover new paths or reveal vulnerabilities in smart contracts. Second, it fuzzes transaction sequences in a targeted and demand-driven way. We have evaluated our approach on 27 real-world contracts. Our experiments show that the underlying techniques significantly increase Harvey's effectiveness in achieving high coverage and detecting vulnerabilities, in most cases orders of magnitude faster; they also reveal new insights about contract code.
    Comment: arXiv admin note: substantial text overlap with arXiv:1807.0787
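The abstract describes standard greybox fuzzing as random input mutation guided by coverage feedback. The following is a minimal sketch of that generic loop only, not of Harvey itself: the toy `target` function, its branch labels, and the helper names `mutate` and `fuzz` are all hypothetical, and neither Harvey's input prediction nor its transaction-sequence fuzzing is modeled. The nested byte comparisons stand in for the "narrow checks" that random mutation struggles to satisfy.

```python
import random


def target(data: bytes) -> set:
    """Toy program under test (hypothetical); returns the branch ids it executed."""
    covered = {"entry"}
    # Each nested comparison is a narrow check: exactly one byte value passes.
    if len(data) > 0 and data[0] == ord("b"):
        covered.add("b")
        if len(data) > 1 and data[1] == ord("i"):
            covered.add("bi")
            if len(data) > 2 and data[2] == ord("d"):
                covered.add("bid")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value (input must be non-empty)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed_input: bytes, iterations: int, seed: int = 0):
    """Coverage-guided loop: keep a mutated input only if it reaches a new branch."""
    rng = random.Random(seed)
    queue = [seed_input]
    seen = set(target(seed_input))
    for _ in range(iterations):
        child = mutate(rng.choice(queue), rng)
        coverage = target(child)
        if not coverage <= seen:  # the child executed at least one unseen branch
            seen |= coverage
            queue.append(child)
    return queue, seen


if __name__ == "__main__":
    queue, seen = fuzz(b"\x00\x00\x00", iterations=5000, seed=1)
    print(len(queue), sorted(seen))
```

Because each check passes for only one of 256 byte values, progress depends on retaining partially successful inputs in the queue; this is the feedback that distinguishes greybox from purely random (blackbox) fuzzing.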

    Improving Software Productivity and Quality via Mining Source Code

    The major goal of software development is to deliver high-quality software efficiently. To achieve this goal, programmers often reuse existing frameworks or libraries, hereafter referred to as libraries, instead of developing similar code artifacts from scratch. However, programmers often face challenges in reusing existing libraries due to two major factors. First, many existing libraries are not well documented. Even when documentation exists, it is often outdated. Second, many existing libraries expose a large number of application programming interfaces (APIs), which represent interfaces through which libraries expose their functionalities. For example, the .NET base library provides nearly 10,000 API classes. Due to these two factors, three major problems affect both software productivity and quality. First, programmers often spend more time in reusing existing libraries, thereby reducing software productivity. Second, programmers introduce defects while using APIs due to a lack of proper knowledge of how to reuse those APIs. Third, existing white-box test generation techniques face challenges in effectively generating test inputs for client code that reuses libraries. To address these three issues, in this dissertation, we propose a general framework, called WebMiner, that uses existing open source code available on the web by leveraging a code search engine. In particular, WebMiner infers usage specifications for API methods under analysis by automatically collecting relevant code examples from the open source code available on the web. WebMiner next applies data mining techniques to those collected code examples to identify common patterns, which represent likely usage of APIs, referred to as API usage specifications. The primary reason for identifying common patterns is the observation that the majority of programmers correctly adhere to API usage specifications, so those common patterns are likely to represent the correct usage of APIs. We further propose six approaches based on our general framework, where each approach focuses on a specific software engineering (SE) task such as detecting defects in an application under analysis. In particular, the first two approaches assist programmers in effectively reusing APIs provided by existing libraries. The next two approaches use mined API usage specifications as programming rules and detect defects in applications under analysis as deviations from the mined specifications. Finally, the last two approaches mine static and dynamic traces, respectively, for effectively generating test inputs that achieve high structural coverage of the code under test. We also propose another approach that addresses a major issue with mining-based approaches, which are not effective in scenarios where usage information is not available for the API methods under analysis or is not sufficient to achieve the SE task under analysis. Our empirical results show that the approaches developed based on our WebMiner framework effectively address their respective SE tasks. In particular, our empirical results demonstrate the effectiveness of expanding the data scope of mining-based approaches to the large body of open source code available on the web. Our results also show that our approaches address queries posted in developer forums and detect new defects that are not detected by existing related approaches, thereby improving both software productivity and quality.
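The idea of treating frequently occurring call patterns as likely usage specifications can be illustrated with a small sketch. This is not WebMiner's algorithm: it assumes each collected code example has already been reduced to an ordered list of API call names, and it mines only ordered call pairs above a support threshold. The function name `mine_ordered_pairs` and the call names in the usage example are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_ordered_pairs(snippets, min_support):
    """Return {(a, b): support} for pairs where call a precedes call b
    in at least a `min_support` fraction of the snippets."""
    counts = Counter()
    for calls in snippets:
        # Collect each ordered pair at most once per snippet.
        pairs = set()
        for i, j in combinations(range(len(calls)), 2):
            if calls[i] != calls[j]:
                pairs.add((calls[i], calls[j]))
        counts.update(pairs)
    threshold = min_support * len(snippets)
    return {
        pair: n / len(snippets)
        for pair, n in counts.items()
        if n >= threshold
    }


if __name__ == "__main__":
    examples = [
        ["open", "read", "close"],
        ["open", "write", "close"],
        ["open", "read"],
    ]
    for pair, support in sorted(mine_ordered_pairs(examples, 0.6).items()):
        print(pair, round(support, 2))
```

A pair such as ("open", "close") that appears in most examples becomes a candidate rule; a client that calls the first without the second is then a candidate defect, in the spirit of the deviation-based detection the abstract describes.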

    Mining exception-handling rules as sequence association rules

    Programming languages such as Java and C++ provide exception-handling constructs to handle exception conditions. Applications are expected to handle these exception conditions and take necessary recovery actions such as releasing opened database connections. However, exception-handling rules that describe these necessary recovery actions are often not available in practice. To address this issue, we develop a novel approach that mines exception-handling rules as sequence association rules of the form “(FC_c^1 ... FC_c^n) ∧ FC_a ⇒ (FC_e^1 ... FC_e^m)”. This rule states that function call FC_a should be followed by a sequence of function calls (FC_e^1 ... FC_e^m) when FC_a is preceded by a sequence of function calls (FC_c^1 ... FC_c^n). Rules of this form are needed to characterize common exception-handling rules. We show the usefulness of these mined rules by applying them to five real-world applications (including 285 KLOC) to detect violations in our evaluation. Our empirical results show that our approach mines 294 real exception-handling rules in these five applications and also detects 160 defects, where 87 defects are new defects that are not found by a previous related approach.
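To make the rule form concrete, here is a minimal sketch of checking one sequence association rule against a call trace. It rests on simplifying assumptions that are not taken from the abstract: the trace is a flat list of call names, and both the context and the recovery sequence are matched as order-preserving, not necessarily contiguous, subsequences. The `Rule` class, the `violations` helper, and the JDBC-style call names in the usage example are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    context: Tuple[str, ...]   # calls expected before the trigger
    trigger: str               # the call that may raise an exception
    recovery: Tuple[str, ...]  # calls expected after the trigger


def is_subsequence(needle: Sequence[str], haystack: Sequence[str]) -> bool:
    """True if `needle` occurs in `haystack` in order (gaps allowed)."""
    it = iter(haystack)
    # `call in it` advances the iterator, so order is enforced.
    return all(call in it for call in needle)


def violations(rule: Rule, trace: Sequence[str]) -> List[int]:
    """Indices of trigger calls whose context matched but whose recovery did not."""
    bad = []
    for i, call in enumerate(trace):
        if call != rule.trigger:
            continue
        if is_subsequence(rule.context, trace[:i]) and not is_subsequence(
            rule.recovery, trace[i + 1:]
        ):
            bad.append(i)
    return bad


if __name__ == "__main__":
    rule = Rule(
        context=("DriverManager.getConnection", "Connection.createStatement"),
        trigger="Statement.executeUpdate",
        recovery=("Connection.rollback", "Connection.close"),
    )
    trace = [
        "DriverManager.getConnection",
        "Connection.createStatement",
        "Statement.executeUpdate",
        "Connection.close",  # rollback is missing
    ]
    print(violations(rule, trace))
```

The context part is what separates this rule form from a plain "a must be followed by b" rule: the recovery requirement applies only when the trigger occurs after the listed preceding calls, so a trigger call without that context is not flagged.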