12 research outputs found

    Event trace reduction for effective bug replay of Android apps via differential GUI state analysis

    Full text link
    Š 2019 ACM. Existing Android testing tools, such as Monkey, generate a large quantity and a wide variety of user events to expose latent GUI bugs in Android apps. However, even if a bug is found, a majority of the events thus generated are often redundant and bug-irrelevant. In addition, it is also time-consuming for developers to localize and replay the bug given a long and tedious event sequence (trace). This paper presents ECHO, an event trace reduction tool for effective bug replay by using a new differential GUI state analysis. Given a sequence of events (trace), ECHO aims at removing bug-irrelevant events by exploiting the differential behavior between the GUI states collected when their corresponding events are triggered. During dynamic testing, ECHO injects at most one lightweight inspection event after every event to collect its corresponding GUI state. A new adaptive model is proposed to selectively inject inspection events based on sliding windows to differentiate the GUI states on-the-fly in a single testing process. The experimental results show that ECHO improves the effectiveness of bug replay by removing 85.11% redundant events on average while also revealing the same bugs as those detected when full event sequences are used

    Automated, Cost-effective, and Update-driven App Testing

    Get PDF
    Apps' pervasive role in our society led to the definition of test automation approaches to ensure their dependability. However, state-of-the-art approaches tend to generate large numbers of test inputs and are unlikely to achieve more than 50% method coverage. In this paper, we propose a strategy to achieve significantly higher coverage of the code affected by updates with a much smaller number of test inputs, thus alleviating the test oracle problem. More specifically, we present ATUA, a model-based approach that synthesizes App models with static analysis, integrates a dynamically-refined state abstraction function and combines complementary testing strategies, including (1) coverage of the model structure, (2) coverage of the App code, (3) random exploration, and (4) coverage of dependencies identified through information retrieval. Its model-based strategy enables ATUA to generate a small set of inputs that exercise only the code affected by the updates. In turn, this makes common test oracle solutions more cost-effective as they tend to involve human effort. A large empirical evaluation, conducted with 72 App versions belonging to nine popular Android Apps, has shown that ATUA is more effective and less effort intensive than state-of-the-art approaches when testing App updates

    “And all the pieces matter...” Hybrid Testing Methods for Android App's Privacy Analysis

    Get PDF
    Smartphones have become inherent to the every day life of billions of people worldwide, and they are used to perform activities such as gaming, interacting with our peers or working. While extremely useful, smartphone apps also have drawbacks, as they can affect the security and privacy of users. Android devices hold a lot of personal data from users, including their social circles (e.g., contacts), usage patterns (e.g., app usage and visited websites) and their physical location. Like in most software products, Android apps often include third-party code (Software Development Kits or SDKs) to include functionality in the app without the need to develop it in-house. Android apps and third-party components embedded in them are often interested in accessing such data, as the online ecosystem is dominated by data-driven business models and revenue streams like advertising. The research community has developed many methods and techniques for analyzing the privacy and security risks of mobile apps, mostly relying on two techniques: static code analysis and dynamic runtime analysis. Static analysis analyzes the code and other resources of an app to detect potential app behaviors. While this makes static analysis easier to scale, it has other drawbacks such as missing app behaviors when developers obfuscate the app’s code to avoid scrutiny. Furthermore, since static analysis only shows potential app behavior, this needs to be confirmed as it can also report false positives due to dead or legacy code. Dynamic analysis analyzes the apps at runtime to provide actual evidence of their behavior. However, these techniques are harder to scale as they need to be run on an instrumented device to collect runtime data. Similarly, there is a need to stimulate the app, simulating real inputs to examine as many code-paths as possible. While there are some automatic techniques to generate synthetic inputs, they have been shown to be insufficient. In this thesis, we explore the benefits of combining static and dynamic analysis techniques to complement each other and reduce their limitations. While most previous work has often relied on using these techniques in isolation, we combine their strengths in different and novel ways that allow us to further study different privacy issues on the Android ecosystem. Namely, we demonstrate the potential of combining these complementary methods to study three inter-related issues: • A regulatory analysis of parental control apps. We use a novel methodology that relies on easy-to-scale static analysis techniques to pin-point potential privacy issues and violations of current legislation by Android apps and their embedded SDKs. We rely on the results from our static analysis to inform the way in which we manually exercise the apps, maximizing our ability to obtain real evidence of these misbehaviors. We study 46 publicly available apps and find instances of data collection and sharing without consent and insecure network transmissions containing personal data. We also see that these apps fail to properly disclose these practices in their privacy policy. • A security analysis of the unauthorized access to permission-protected data without user consent. We use a novel technique that combines the strengths of static and dynamic analysis, by first comparing the data sent by applications at runtime with the permissions granted to each app in order to find instances of potential unauthorized access to permission protected data. Once we have discovered the apps that are accessing personal data without permission, we statically analyze their code in order to discover covert- and side-channels used by apps and SDKs to circumvent the permission system. This methodology allows us to discover apps using the MAC address as a surrogate for location data, two SDKs using the external storage as a covert-channel to share unique identifiers and an app using picture metadata to gain unauthorized access to location data. • A novel SDK detection methodology that relies on obtaining signals observed both in the app’s code and static resources and during its runtime behavior. Then, we rely on a tree structure together with a confidence based system to accurately detect SDK presence without the need of any a priory knowledge and with the ability to discern whether a given SDK is part of legacy or dead code. We prove that this novel methodology can discover third-party SDKs with more accuracy than state-of-the-art tools both on a set of purpose-built ground-truth apps and on a dataset of 5k publicly available apps. With these three case studies, we are able to highlight the benefits of combining static and dynamic analysis techniques for the study of the privacy and security guarantees and risks of Android apps and third-party SDKs. The use of these techniques in isolation would not have allowed us to deeply investigate these privacy issues, as we would lack the ability to provide real evidence of potential breaches of legislation, to pin-point the specific way in which apps are leveraging cover and side channels to break Android’s permission system or we would be unable to adapt to an ever-changing ecosystem of Android third-party companies.The works presented in this thesis were partially funded within the framework of the following projects and grants: • European Union’s Horizon 2020 Innovation Action program (Grant Agreement No. 786741, SMOOTH Project and Grant Agreement No. 101021377, TRUST AWARE Project). • Spanish Government ODIO NºPID2019-111429RB-C21/PID2019-111429RBC22. • The Spanish Data Protection Agency (AEPD) • AppCensus Inc.This work has been supported by IMDEA Networks InstitutePrograma de Doctorado en Ingeniería Telemática por la Universidad Carlos III de MadridPresidente: Srdjan Matic.- Secretario: Guillermo Suárez-Tangil.- Vocal: Ben Stoc

    The construction and applications of callback control flow graphs for event-driven and framework-based mobile apps

    Get PDF
    Mobile devices have become ubiquitous over the last years. Android, as the leading platform in the mobile ecosystem, have over 2.5 million apps published in Google Play Market. This enormous ecosystem creates a fierce competition between apps with similar functionality in which the low quality of apps has been shown to increase the churn rate considerably. Additionally, the complex event-driven, framework-based architecture that developers use to implement apps imposes several challenges and led to new varieties of code smells and bugs. There is a need for tools that assure the quality of apps such as program analysis and testing tools. One of the foundational challenges for developing these tools is the sequencing or ordering of callback methods invoked from external events (e.g. GUI events) and framework calls. Even for a small subset of callbacks, it has been shown that the current state-of-the-art tools fail to generate sequences of callbacks that match the runtime behavior of Android apps. This thesis explores the construction and applications of new representations and program analyses for event-driven, framework-based mobile applications, specifically Android apps. In Android, we observe that the changes of control flow between entry points are mostly handled by the framework using callbacks. These callbacks can be executed synchronously and asynchronously when an external event happens (e.g. a click event) or a framework call is made. In framework-based systems, method calls to the framework can invoke sequences of callbacks. With the high overhead introduced by libraries such as the Android framework, most current tools for the analysis of Android apps have opted to skip the analysis of these libraries. Thus, these analyses missed the correct order of callbacks for each callback invoked in framework calls. This thesis presents a new specification called Predicate Callback Summary (PCS) to summarize how library or API methods invoke callbacks. PCSs enable inter-procedural analysis for Android apps without the overhead of analyzing the whole framework and help developers understand how their code (callback methods) is executed in the framework. We show that our static analysis techniques to summarize PCSs is accurate and scalable, considering the complexity of the millions of lines of code in the Android framework. With PCSs summaries, we have information about the control flow of callbacks invoked in framework calls but lack information about how external events can execute callbacks. To integrate event-driven control flow behavior with control behavior generated from framework calls, we designed a novel program representation, namely Callback Control Flow Automata (CCFA). The design of CCFA is based on the Extended Finite State Machine (EFSM) model, which extends the Finite State Machine (FSM) by labeling transitions using information such as guards. In a CCFA, a state represents whether the execution path enters or exits a callback. The transition from one state to another represents the transfer of control flow between callbacks. We present an analysis to automatically construct CCFAs by combining two callback control flow representations developed from the previous research, namely, Window Transition Graphs (WTGs) and PCSs. To demonstrate the usefulness of our representation, we integrated CCFAs into two client analyses: a taint analysis using FLOWDROID, and a value-flow analysis that computes source and sink pairs of a program. Our evaluation shows that we can compute CCFAs efficiently and that CCFAs improved the callback coverages over WTGs. As a result of using CCFAs, we obtained 33 more true positive security leaks than FLOWDROID over a total of 55 apps we have run. With a low false positive rate, we found that 22.76\% of source-sink pairs we computed are located in different callbacks and that 31 out of 55 apps contain source-sink pairs spreading across components. In the last part of this thesis, we use the CCFAs to develop a new family of coverage criteria based on callback sequences for more effective testing Android apps. We present 2 studies to help us identify what types of callbacks are important when detecting bugs. With the help of the empirical results, we defined 3 coverage criteria based on callback sequences. Our evaluation shows that our coverage criteria are a more effective metric than statement and GUI-based event coverage to guide test input generation
    corecore