
    Static Analysis in Practice

    Static analysis tools search software looking for defects that may cause an application to deviate from its intended behavior. These include defects that compute incorrect values, cause runtime exceptions or crashes, expose applications to security vulnerabilities, or lead to performance degradation. In an ideal world, the analysis would precisely identify all possible defects. In reality, it is not always possible to infer the intent of a software component or code fragment, and static analysis tools sometimes output spurious warnings or miss important bugs. As a result, tool makers and researchers focus on developing heuristics and techniques to improve speed and accuracy. But, in practice, speed and accuracy are not sufficient to maximize the value received by software makers using static analysis. Software engineering teams need to make static analysis an effective part of their regular process. In this dissertation, I examine the ways static analysis is used in practice by commercial and open source users. I observe that effectiveness is hampered not only by false warnings, but also by true defects that do not affect software behavior in practice. Indeed, mature production systems are often littered with true defects that do not prevent them from functioning mostly correctly. To understand why this occurs, observe that developers inadvertently create both important and unimportant defects when they write software, but most quality assurance activities are directed at finding the important ones. By the time the system is mature, there may still be a few consequential defects that can be found by static analysis, but they are drowned out by the many true but low-impact defects that were never fixed. An exception to this rule is certain classes of subtle security, performance, or concurrency defects that are hard to detect without static analysis. Software teams can use static analysis to find defects very early in the process, when they are cheapest to fix, and in so doing increase the effectiveness of later quality assurance activities. But this effort comes with costs that must be managed to ensure static analysis is worthwhile. The cost effectiveness of static analysis also depends on the nature of the defect being sought, the nature of the application, the infrastructure supporting the tools, and the policies governing their use. Through this research, I interact with real users through surveys, interviews, lab studies, and community-wide reviews to discover their perspectives and experiences, and to understand the costs and challenges incurred when adopting static analysis tools. I also analyze the defects found in real systems and make observations about which ones are fixed, why some seemingly serious defects persist, and what considerations static analysis tools and software teams should make to increase effectiveness. Ultimately, my interaction with real users confirms that static analysis is well received and useful in practice, but the right environment is needed to maximize its return on investment.
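
    To make the distinction concrete, below is a minimal Java sketch (illustrative only, not taken from the dissertation) of the kind of finding the abstract describes: a true defect that a static analyzer rightly reports, but whose impact depends entirely on how the code is used in practice.

        // A "true but low-impact" defect of the kind discussed above. An
        // analyzer flags a possible null dereference in greet(); if every
        // production caller happens to pass a non-null name, the warning
        // is correct yet never observed as a failure, and it competes for
        // attention with the few consequential findings.
        public class GreetingFormatter {

            // Warning: `name` may be null, causing a NullPointerException.
            public static String greet(String name) {
                return "Hello, " + name.trim();
            }

            public static void main(String[] args) {
                System.out.println(greet("world")); // never null in practice
            }
        }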

    Android source code vulnerability detection: a systematic literature review

    The use of mobile devices is rising daily in this technological era. A continuous and increasing number of mobile applications are constantly offered on mobile marketplaces to fulfil the needs of smartphone users. Many Android applications do not address the security aspects appropriately, often due to a lack of automated mechanisms to identify, test, and fix source code vulnerabilities at the early stages of design and development. Therefore, the need to fix such issues at the initial stages, rather than providing updates and patches to the published applications, is widely recognized. Researchers have proposed several methods to improve the security of applications by detecting source code vulnerabilities and malicious code. This Systematic Literature Review (SLR) focuses on Android application analysis and source code vulnerability detection methods and tools by critically evaluating 118 carefully selected technical studies published between 2016 and 2022. It highlights the advantages, disadvantages, and applicability of the proposed techniques, as well as potential improvements of those studies. Both Machine Learning (ML) based methods and conventional methods of vulnerability detection are discussed, with more focus on ML-based methods since many recent studies conducted experiments with ML. This paper therefore aims to enable researchers to acquire in-depth knowledge of secure mobile application development while minimizing vulnerabilities by applying ML methods. Furthermore, researchers can use the discussions and findings of this SLR to identify potential future research and development directions.
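
    As a concrete illustration of the kind of source-level weakness such detection tools target, the Java sketch below shows two classic statically detectable Android vulnerabilities. The example is illustrative only and is not drawn from the reviewed studies.

        import javax.crypto.Cipher;
        import javax.crypto.spec.SecretKeySpec;

        // Two common source code vulnerabilities in Android apps: a
        // hardcoded cryptographic key (CWE-798) and the pattern-leaking
        // ECB cipher mode (CWE-327).
        public class WeakCrypto {
            private static final byte[] HARDCODED_KEY =
                    "0123456789abcdef".getBytes(); // key embedded in source

            public static byte[] encrypt(byte[] plaintext) throws Exception {
                // ECB encrypts equal blocks to equal ciphertext, leaking
                // structure; an authenticated mode such as GCM is preferred.
                Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
                cipher.init(Cipher.ENCRYPT_MODE,
                        new SecretKeySpec(HARDCODED_KEY, "AES"));
                return cipher.doFinal(plaintext);
            }
        }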

    17th Edition of ECOOP Doctoral Symposium and PhD Workshop : Proceedings

    Dependability where the mobile world meets the enterprise world

    As we move toward increasingly larger scales of computing, the complexity of systems and networks has increased manifold, leading to massive failures of cloud providers (Amazon CloudFront, November 2014) and geographically localized outages of cellular services (T-Mobile, June 2014). In this dissertation, we investigate the dependability aspects of two of the most prevalent computing platforms today, namely, smartphones and cloud computing. These two seemingly disparate platforms are part of a cohesive story: they interact to provide end-to-end services which are increasingly being delivered over mobile platforms, examples being iCloud, Google Drive and their smartphone counterparts iPhone and Android. In one of the early works on characterizing failures in dominant mobile OSes, we analyzed bug repositories of Android and Symbian and found similarities in their failure modes [ISSRE2010]. We also presented a classification of root causes and quantified the impact of the ease of customizing smartphones on system reliability. Our evaluation of Inter-Component Communication in Android [DSN2012] shows an alarming number of exception handling errors where a phone may be crashed by passing it malformed component invocation messages, even from unprivileged applications. In this work, we also suggest language extensions that can mitigate these problems. Mobile applications today are increasingly being used to interact with enterprise-class web services commonly hosted in virtualized environments. Virtualization suffers from the problem of imperfect performance isolation, where contention for low-level hardware resources can impact application performance. Through a set of rigorous experiments in a private cloud testbed and in EC2, we show that interference-induced performance degradation is a reality. Our experiments have also shown that the optimal configuration settings for web servers change during such phases of interference. Based on this observation, we design and implement the IC2 engine, which can mitigate the effects of interference by reconfiguring web server parameters [MW2014]. We further improve IC2 by incorporating it into a two-level configuration engine, named ICE, for managing web server clusters [ICAC2015]. Our evaluations show that, compared to an interference-agnostic configuration, IC2 can improve the response time of web servers by up to 40%, while ICE can improve response time by up to 94% during phases of interference.
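
    The Inter-Component Communication failure mode described above is easy to reproduce: a component that trusts every incoming message can be crashed by a malformed or missing Intent. The Java sketch below (an illustration, not code from the dissertation) contrasts the fragile pattern with a defensive one.

        import android.app.Service;
        import android.content.Intent;
        import android.os.IBinder;

        public class DownloadService extends Service {

            @Override
            public int onStartCommand(Intent intent, int flags, int startId) {
                // Fragile: `intent` may be null on restart, and the extra
                // may be absent; either case throws and kills the process.
                //   String url = intent.getExtras().getString("url");

                // Defensive: validate the message before acting on it.
                String url = (intent == null) ? null : intent.getStringExtra("url");
                if (url == null) {
                    stopSelf(startId);
                    return START_NOT_STICKY;
                }
                // ... start the download for `url` ...
                return START_NOT_STICKY;
            }

            @Override
            public IBinder onBind(Intent intent) {
                return null; // not a bound service in this sketch
            }
        }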

    Automated refactoring for Java concurrency

    In the multicore era, programmers exploit concurrent programming to gain performance and responsiveness benefits. However, concurrent programs are difficult to write: the programmer has to balance two conflicting forces, thread safety and performance. To make concurrent programming easier, modern programming languages provide many kinds of concurrent constructs, such as threads, asynchronous tasks, concurrent collections, etc. However, despite the existence of these concurrent constructs, we know little about how developers use them. On the other hand, although existing API documentation teaches developers how to use concurrent constructs, developers can still misuse and underuse them. In this dissertation, we study the use, misuse, and underuse of two types of commonly used Java concurrent constructs: Java concurrent collections and Android async constructs. Our studies show that even though concurrent constructs are widely used in practice, developers still misuse and underuse them, causing semantic and performance bugs. We propose and develop a refactoring toolset to help developers correctly use concurrent constructs. The toolset is composed of three automated refactorings: (i) detecting and fixing the misuses of Java concurrent collections, (ii) retrofitting concurrency for existing sequential Android code via a basic Android async construct, and (iii) converting inappropriately used basic Android async constructs to appropriately enhanced constructs for Android apps. Refactorings (i) and (iii) aim to fix misused constructs, while refactoring (ii) aims to eliminate underuses. First, we cataloged nine commonly misused check-then-act idioms of Java concurrent collections, and show the correct usage of each idiom. We implemented the detection strategies in a tool, CTADetector, that finds and fixes misused check-then-act idioms. We applied CTADetector to 28 widely used open source Java projects (comprising 6.4 million lines of code) that use Java concurrent collections. CTADetector discovered and fixed 60 bugs. These bugs were confirmed by developers and the fixes were accepted. Second, we conducted a formative study on how a basic Android async construct, AsyncTask, is used, misused, and underused in Android apps. Based on the study, we designed, developed, and evaluated Asynchronizer, an automated refactoring tool that enables developers to retrofit concurrency into Android apps. The refactoring uses a points-to static analysis to determine the safety of the refactoring. We applied Asynchronizer to perform 123 refactorings in 19 widely used Android apps; their developers accepted 40 refactorings in 7 projects. Third, we conducted a formative study on a corpus of 611 widely-used Android apps to map the asynchronous landscape of Android apps, understand how developers retrofit concurrency in Android apps, and learn about barriers encountered by developers. Based on this study, we designed, implemented, and evaluated AsyncDroid, a refactoring tool which enables Android developers to transform existing improperly-used async constructs into correct constructs. We submitted 45 refactoring patches generated by AsyncDroid to 7 widely used Android projects, and developers accepted 15 of them. Finally, we released all tools as open-source plugins for the widely used Eclipse IDE, which has millions of Java users. Moreover, we also integrated CTADetector and AsyncDroid with ShipShape, a static analysis platform developed by Google. Google envisions ShipShape becoming a widely used platform: any app developer who wants to check code quality, for example before submitting an app to the app store, would run ShipShape on her code base. We expect that by contributing new async analyzers to ShipShape, millions of app developers will benefit by being able to execute our analyses and transformations on their code.
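
    The check-then-act misuse class at the heart of CTADetector is easy to see in a small Java example. The sketch below is a generic instance of the idiom (not one of the dissertation's nine cataloged cases verbatim): the check and the act are each atomic, but their composition is not.

        import java.util.concurrent.ConcurrentHashMap;
        import java.util.concurrent.ConcurrentMap;

        public class CheckThenAct {
            private final ConcurrentMap<String, Integer> hits =
                    new ConcurrentHashMap<>();

            // Misuse: two threads can interleave between containsKey and
            // put, so initialization races and updates are silently lost.
            public void recordRacy(String key) {
                if (!hits.containsKey(key)) {
                    hits.put(key, 0);
                }
                hits.put(key, hits.get(key) + 1); // read-modify-write race
            }

            // Fix in the spirit of such refactorings: use the collection's
            // atomic compound operation instead of a hand-rolled check.
            public void record(String key) {
                hits.merge(key, 1, Integer::sum);
            }
        }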

    Doctor of Philosophy

    High Performance Computing (HPC) on-node parallelism is of extreme importance to guarantee and maintain scalability across large clusters of hundreds of thousands of multicore nodes. HPC programming is dominated by the hybrid model "MPI + X", with MPI to exploit the parallelism across the nodes, and "X" as some shared memory parallel programming model to accomplish multicore parallelism across CPUs or GPUs. OpenMP has become the de facto "X" standard in HPC to exploit the multicore architectures of modern CPUs. Data races are among the most common and insidious concurrency errors in shared memory programming models, and OpenMP programs are not immune to them. The ease of parallelizing programs that OpenMP provides can also make them prone to data races, which become hard to find in large applications with thousands of lines of code. Unfortunately, prior tools have been unable to impact practice owing to their poor coverage or poor scalability. In this work, we develop several new approaches for low overhead data race detection. Our approaches aim to guarantee high precision and accuracy of race checking while maintaining low runtime and memory overhead. We present two race checkers for C/C++ OpenMP programs that target two different classes of programs. The first, ARCHER, is fast but requires a large amount of memory, so it ideally targets applications that require only a small portion of the available on-node memory. SWORD, on the other hand, performs fast, zero memory overhead data collection followed by an offline analysis that can take a long time, although it often reports most races quickly. Given that race checking was previously impossible for large OpenMP applications, our contributions are the best available advances in what is known to be a difficult NP-complete problem. We performed an extensive evaluation of the tools on existing OpenMP programs and HPC benchmarks. Results show that both tools are guaranteed to identify all the races of a program in a given run without reporting any false alarms. The tools are user-friendly and hence serve as an important instrument for the daily work of programmers, helping them identify data races early during development and production testing. Furthermore, our demonstrated success on real-world applications puts these tools on the top list of debugging tools for scientists at large.
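
    ARCHER and SWORD target C/C++ OpenMP programs; the sketch below only illustrates the defect class itself, an unsynchronized read-modify-write on shared state, using plain Java threads for consistency with the other examples in this listing.

        // The canonical data race: two threads perform an unsynchronized
        // read-modify-write on shared state, and updates are lost.
        public class RacyCounter {
            private static long counter = 0; // shared, unprotected

            public static void main(String[] args) throws InterruptedException {
                Runnable work = () -> {
                    for (int i = 0; i < 100_000; i++) {
                        counter++; // load, add, store: interleavings lose updates
                    }
                };
                Thread t1 = new Thread(work);
                Thread t2 = new Thread(work);
                t1.start();
                t2.start();
                t1.join();
                t2.join();
                // Almost always prints less than 200000.
                System.out.println(counter);
            }
        }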

    Towards understanding the challenges faced by machine learning software developers and enabling automated solutions

    Modern software systems increasingly include machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To fill that gap, this thesis reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikit-learn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. Our findings reveal the urgent need for software engineering (SE) research in this area. The second part of the thesis focuses on understanding the characteristics of Deep Neural Network (DNN) bugs. We study 2,716 high-quality posts from Stack Overflow and 500 bug fix commits from GitHub about five popular deep learning libraries, Caffe, Keras, Tensorflow, Theano, and Torch, to understand the types of bugs, their root causes and impacts, the bug-prone stages of the deep learning pipeline, and whether there are common antipatterns in this buggy software. Our findings imply that repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand the challenges of repair or the patterns that are utilized when manually repairing DNNs. The third part of this thesis therefore presents a comprehensive study of bug fix patterns to address these questions. We studied 415 repairs from Stack Overflow and 555 repairs from GitHub for the same five deep learning libraries to understand the challenges in repairs and the bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns, and that the most common bug fix patterns are fixing data dimensions and neural network connectivity. Finally, we propose an automatic technique to detect ML Application Programming Interface (API) misuses. We started with an empirical study to understand ML API misuses. Our study shows that ML API misuse is prevalent and distinct compared to non-ML API misuse. Inspired by these findings, we contributed Amimla (Api Misuse In Machine Learning Apis), an approach and a tool for ML API misuse detection. Amimla relies on several technical innovations. First, we proposed an abstract representation of ML pipelines to use in misuse detection. Second, we proposed an abstract representation of neural networks for deep learning related APIs. Third, we developed a representation strategy for constraints on ML APIs. Finally, we developed a misuse detection strategy for both single and multiple APIs. Our experimental evaluation shows that Amimla achieves a high average accuracy of ∼80% on two benchmarks of misuses from Stack Overflow and GitHub.
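
    To give a flavor of the constraint-checking idea behind such misuse detectors, the Java sketch below models an ML pipeline as an abstract sequence of stages and flags a call order that violates a "fit before predict" rule. The stage names and the rule are illustrative assumptions, not Amimla's actual representations.

        import java.util.List;

        public class PipelineChecker {
            enum Stage { LOAD_DATA, PREPROCESS, FIT, PREDICT }

            // Constraint: a model must be fitted before it is used to
            // predict; report a misuse when PREDICT appears before FIT.
            static boolean violatesFitBeforePredict(List<Stage> calls) {
                boolean fitted = false;
                for (Stage s : calls) {
                    if (s == Stage.FIT) fitted = true;
                    if (s == Stage.PREDICT && !fitted) return true;
                }
                return false;
            }

            public static void main(String[] args) {
                List<Stage> buggy = List.of(Stage.LOAD_DATA, Stage.PREDICT);
                List<Stage> ok = List.of(Stage.LOAD_DATA, Stage.PREPROCESS,
                        Stage.FIT, Stage.PREDICT);
                System.out.println(violatesFitBeforePredict(buggy)); // true
                System.out.println(violatesFitBeforePredict(ok));    // false
            }
        }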

    The Medium of Visualization for Software Comprehension

    Although abundant studies have shown how visualization can help software developers to understand software systems, visualization is still not a common practice, since developers (i) have little support to find a proper visualization for their needs and, once they find a suitable visualization tool, (ii) are unsure of its effectiveness. We aim to offer support for identifying proper visualizations and to increase the effectiveness of visualization techniques. In this dissertation, we characterize proposed software visualizations. To fill the gap between proposed visualizations and their practical application, we encapsulate their characteristics in an ontology and propose a meta-visualization approach to find suitable visualizations. Amongst other characteristics of software visualizations, we identify that the medium used to display them can be a means to increase the effectiveness of visualization techniques for particular comprehension tasks. We implement visualization prototypes and validate our thesis via experiments. We found that even though developers using a physical 3D model medium required the least time to deal with tasks that involve identifying outliers, they perceived the least difficulty when visualizing systems based on the standard computer screen medium. Moreover, developers using immersive virtual reality obtained the highest recollection. We conclude that the effectiveness of software visualizations that use the city metaphor to support comprehension tasks can be increased when city visualizations are rendered in an appropriate medium. Furthermore, we conclude that the visualization of software visualizations can be a suitable means for exploring their multiple characteristics, which can be properly encapsulated in an ontology.

    Detecting Dissimilar Classes of Source Code Defects

    Software maintenance accounts for the largest part of software development cost and effort, with its major activities focused on the detection, location, analysis and removal of defects present in the software. Although software defects can originate, and be present, at any phase of the software development life-cycle, the implementation (i.e., source code) contains more than three-fourths of the total defects. Due to the diverse nature of these defects, their detection and analysis activities have to be carried out by equally diverse tools, often necessitating the application of multiple tools for reasonable defect coverage, which directly increases maintenance overhead. Unified detection tools are known to combine different specialized techniques into a single and massive core, resulting in operational difficulty and increased maintenance cost. The objective of this research was to search for a technique that can detect dissimilar defects using a simplified model and a single methodology, both of which should contribute to creating an easy-to-acquire solution. Following this goal, a ‘Supervised Automation Framework’ named FlexTax was developed for semi-automatic defect mapping and taxonomy generation, which was then applied to a large-scale real-world defect dataset to generate a comprehensive Defect Taxonomy that was verified using machine learning classifiers and manual verification. This Taxonomy, along with an extensive literature survey, was used to comprehend the properties of different classes of defects and to develop Defect Similarity Metrics. The Taxonomy and the Similarity Metrics were then used to develop a defect detection model and associated techniques, collectively named Symbolic Range Tuple Analysis, or SRTA. SRTA relies on Symbolic Analysis, Path Summarization and Range Propagation to detect dissimilar classes of defects using a simplified set of operations. To verify the effectiveness of the technique, SRTA was evaluated by processing multiple real-world open-source systems, by direct comparison with three state-of-the-art tools, by a controlled experiment, by using an established benchmark, by comparison with other tools through secondary data, and by a large-scale fault-injection experiment conducted using a Mutation-Injection Framework, which relied on the taxonomy developed earlier for the definition of mutation rules. Experimental results confirmed SRTA's practicality, generality, scalability and accuracy, and proved SRTA's applicability as a new defect detection technique.
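
    As a deliberately simplified illustration of the range-propagation ingredient of such techniques (not the dissertation's actual tuple representation), the Java sketch below tracks a [lo, hi] interval for each value and flags an index expression whose propagated range can escape the bounds of an array.

        public class RangeCheck {
            // An interval abstraction of an integer value.
            static final class Range {
                final long lo, hi;
                Range(long lo, long hi) { this.lo = lo; this.hi = hi; }
                Range add(Range o) { return new Range(lo + o.lo, hi + o.hi); }
            }

            // Report a possible out-of-bounds access when the propagated
            // range of the index is not contained in [0, length - 1].
            static boolean mayBeOutOfBounds(Range index, int arrayLength) {
                return index.lo < 0 || index.hi > arrayLength - 1;
            }

            public static void main(String[] args) {
                Range i = new Range(0, 9);      // loop variable: 0..9
                Range offset = new Range(0, 2); // external offset: 0..2
                Range idx = i.add(offset);      // propagated range: 0..11
                System.out.println(mayBeOutOfBounds(idx, 10)); // true: flagged
            }
        }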