Overcoming Language Dichotomies: Toward Effective Program Comprehension for Mobile App Development
Mobile devices and platforms have become an established target for modern
software developers due to performant hardware and a large and growing user
base numbering in the billions. Despite their popularity, the software
development process for mobile apps comes with a set of unique, domain-specific
challenges rooted in program comprehension. Many of these challenges stem from
developer difficulties in reasoning about different representations of a
program, a phenomenon we define as a "language dichotomy". In this paper, we
reflect upon the various language dichotomies that contribute to open problems
in program comprehension and development for mobile apps. Furthermore, to help
guide the research community towards effective solutions for these problems, we
provide a roadmap of directions for future work.
Comment: Invited Keynote Paper for the 26th IEEE/ACM International Conference
on Program Comprehension (ICPC'18
An analysis of Android malware classification services
The increasing number of Android malware samples has forced antivirus (AV) companies to rely on automated classification techniques to determine the family and class of suspicious samples. The research community relies heavily on such labels to carry out prevalence studies of the threat ecosystem and to build datasets that are used to validate and benchmark novel detection and classification methods. In this work, we carry out an extensive study of the Android malware ecosystem by surveying white papers and reports from 6 key players in the industry, as well as 81 papers from 8 top security conferences, to understand how malware datasets are used by both. We then explore the limitations associated with the use of available malware classification services, namely VirusTotal (VT) engines, for determining the family of an Android sample. Using a dataset of 2.47 M Android malware samples, we find that the detection coverage of VT's AVs is generally very low, that the percentage of samples flagged by any 2 AV engines does not go beyond 52%, and that the families common to any pair of AV engines amount to at best 29%. We rely on clustering to determine the extent to which different AV engine pairs agree upon which samples belong to the same family (regardless of the actual family name) and find that there are discrepancies that can introduce noise in automatic label unification schemes. We also observe the usage of generic labels and inconsistencies within the labels of top AV engines, suggesting that their efforts are directed towards accurate detection rather than classification. Our results contribute to a better understanding of the limitations of using Android malware family labels as supplied by common AV engines.
This work has been supported by the “Ramon y Cajal” Fellowship RYC-2020-029401
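As a rough illustration of what name-independent agreement between two AV engines can mean, the sketch below counts the sample pairs on which both engines make the same "same family" or "different family" decision (a Rand-style pair-counting score). The engine names, family labels, and sample identifiers are invented for illustration, and this is not claimed to be the clustering procedure used in the paper.

```python
from itertools import combinations

def pairwise_agreement(labels_a, labels_b):
    """Fraction of sample pairs on which two engines agree about
    'same family' vs. 'different family', ignoring the family names."""
    # Only samples labelled by both engines can be compared.
    shared = sorted(set(labels_a) & set(labels_b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        return None
    agree = sum(
        (labels_a[x] == labels_a[y]) == (labels_b[x] == labels_b[y])
        for x, y in pairs
    )
    return agree / len(pairs)

# Hypothetical labels from two engines for four samples.
engine_a = {"s1": "fam_x", "s2": "fam_x", "s3": "fam_y", "s4": "fam_y"}
engine_b = {"s1": "grp_1", "s2": "grp_1", "s3": "grp_1", "s4": "grp_2"}

score = pairwise_agreement(engine_a, engine_b)
print(score)
```

In this toy input the engines agree on three of the six sample pairs, even though none of their family names match.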
Stack Overflow: A Code Laundering Platform?
Developers use Question and Answer (Q&A) websites to exchange knowledge and
expertise. Stack Overflow is a popular Q&A website where developers discuss
coding problems and share code examples. Although all Stack Overflow posts are
free to access, code examples on Stack Overflow are governed by the Creative
Commons Attribution-ShareAlike 3.0 Unported license that developers should obey
when reusing code from Stack Overflow or posting code to Stack Overflow. In
this paper, we conduct a case study with 399 Android apps, to investigate
whether developers respect license terms when reusing code from Stack Overflow
posts (and the other way around). We found 232 code snippets in 62 Android apps
from our dataset that were potentially reused from Stack Overflow, and 1,226
Stack Overflow posts containing code examples that are clones of code released
in 68 Android apps, suggesting that developers may have copied the code of
these apps to answer Stack Overflow questions. We investigated the licenses of
these pieces of code and observed 1,279 cases of potential license violations
(related to code posting to Stack Overflow or code reuse from Stack Overflow).
This paper aims to raise the awareness of the software engineering community
about potential unethical code reuse activities taking place on Q&A websites
like Stack Overflow.
Comment: In proceedings of the 24th IEEE International Conference on Software
Analysis, Evolution, and Reengineering (SANER
Explainable AI for Android Malware Detection: Towards Understanding Why the Models Perform So Well?
Machine learning (ML)-based Android malware detection has been one of the
most popular research topics in the mobile security community. An increasing
number of research studies have demonstrated that machine learning is an
effective and promising approach for malware detection, and some works have
even claimed that their proposed models could achieve 99% detection accuracy,
leaving little room for further improvement. However, numerous prior studies
have suggested that unrealistic experimental designs bring substantial biases,
resulting in over-optimistic performance in malware detection. Unlike previous
research that examined the detection performance of ML classifiers to locate
the causes, this study employs Explainable AI (XAI) approaches to explore what
ML-based models learned during the training process, inspecting and
interpreting why ML-based malware classifiers perform so well under unrealistic
experimental settings. We discover that temporal sample inconsistency in the
training dataset brings over-optimistic classification performance (up to 99%
F1 score and accuracy). Importantly, our results indicate that ML models
classify malware based on temporal differences between malware and benign,
rather than the actual malicious behaviors. Our evaluation also confirms the
fact that unrealistic experimental designs lead to not only unrealistic
detection performance but also poor reliability, posing a significant obstacle
to real-world applications. These findings suggest that XAI approaches should
be used to help practitioners/researchers better understand how AI/ML models
(i.e., malware detection) work, rather than focusing only on accuracy improvement.
Comment: Accepted by the 33rd IEEE International Symposium on Software
Reliability Engineering (ISSRE 2022
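The temporal inconsistency described in this abstract can be made concrete with a small check on sample dates. The sketch below is a minimal illustration under the assumption that each sample carries a hypothetical `first_seen` date; the field names and data are invented and do not come from the paper.

```python
from datetime import date

def temporal_split(samples, cutoff):
    """Train on samples first seen before `cutoff`, test on the rest."""
    train = [s for s in samples if s["first_seen"] < cutoff]
    test = [s for s in samples if s["first_seen"] >= cutoff]
    return train, test

def class_date_ranges(samples):
    """Earliest and latest `first_seen` date for each label."""
    ranges = {}
    for s in samples:
        d = s["first_seen"]
        lo, hi = ranges.get(s["label"], (d, d))
        ranges[s["label"]] = (min(lo, d), max(hi, d))
    return ranges

# Hypothetical dataset in which malware and benign apps come from different years.
samples = [
    {"id": "a", "label": "malware", "first_seen": date(2015, 3, 1)},
    {"id": "b", "label": "malware", "first_seen": date(2016, 7, 9)},
    {"id": "c", "label": "benign", "first_seen": date(2019, 1, 20)},
    {"id": "d", "label": "benign", "first_seen": date(2020, 5, 2)},
]

ranges = class_date_ranges(samples)
train, test = temporal_split(samples, date(2018, 1, 1))
```

In this toy dataset the date ranges of the two classes do not overlap, so a model could separate them using age-related features alone. The time-aware split exposes the problem directly, because the training side ends up containing only one class.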
IoTSan: Fortifying the Safety of IoT Systems
Today's IoT systems include event-driven smart applications (apps) that
interact with sensors and actuators. A problem specific to IoT systems is that
buggy apps, unforeseen bad app interactions, or device/communication failures
can cause unsafe and dangerous physical states. Detecting flaws that lead to
such states requires a holistic view of installed apps, component devices,
their configurations, and, more importantly, how they interact. In this paper,
we design IoTSan, a novel practical system that uses model checking as a
building block to reveal "interaction-level" flaws by identifying events that
can lead the system to unsafe states. In building IoTSan, we design novel
techniques tailored to IoT systems, to alleviate the state explosion associated
with model checking. IoTSan also automatically translates IoT apps into a
format amenable to model checking. Finally, to understand the root cause of a
detected vulnerability, we design an attribution mechanism to identify
problematic and potentially malicious apps. We evaluate IoTSan on the Samsung
SmartThings platform. From 76 manually configured systems, IoTSan detects 147
vulnerabilities. We also evaluate IoTSan with malicious SmartThings apps from a
previous effort. IoTSan detects the potential safety violations and also
effectively attributes these apps as malicious.
Comment: Proc. of the 14th ACM CoNEXT, 201
Program analysis for Android security and reliability
The recent, widespread growth and adoption of mobile devices have revolutionized the way users interact with technology. As mobile apps have become increasingly prevalent, concerns regarding their security and reliability have gained significant attention. The ever-expanding mobile app ecosystem presents unique challenges in ensuring the protection of user data and maintaining app robustness. This dissertation expands the field of program analysis with techniques and abstractions tailored explicitly to enhancing Android security and reliability. This research introduces approaches for addressing critical issues related to sensitive information leakage, device and user fingerprinting, mobile medical score calculators, as well as termination-induced data loss. Through a series of comprehensive studies and employing novel approaches that combine static and dynamic analysis, this work provides valuable insights and practical solutions to the aforementioned challenges. In summary, this dissertation makes the following contributions: (1) precise identifier leak tracking via a novel algebraic representation of leak signatures, (2) identifier processing graphs (IPGs), an abstraction for extracting and subverting user-based and device-based fingerprinting schemes, (3) interval-based verification of medical score calculator correctness, and (4) identifying potential data losses caused by app termination
Automated Mapping of Adaptive App GUIs from Phones to TVs
With the increasing interconnection of smart devices, users often desire to
adopt the same app on quite different devices for identical tasks, such as
watching the same movies on both their smartphones and TV.
However, the significant differences in screen size, aspect ratio, and
interaction styles make it challenging to adapt Graphical User Interfaces
(GUIs) across these devices.
Although there are millions of apps available on Google Play, only a few
thousand are designed to support smart TV displays.
Existing techniques to map a mobile app GUI to a TV either adopt a responsive
design, which struggles to bridge the substantial gap between phone and TV, or
use mirror apps for improved video display, which requires hardware support and
extra engineering efforts.
Instead of developing another app for supporting TVs, we propose a
semi-automated approach to generate corresponding adaptive TV GUIs, given the
phone GUIs as the input.
Based on our empirical study of GUI pairs for TV and phone in existing apps,
we synthesize a list of rules for grouping and classifying phone GUIs,
converting them to TV GUIs, and generating dynamic TV layouts and source code
for the TV display.
Our tool is not only beneficial to developers but also to GUI designers, who
can further customize the generated GUIs for their TV app development.
An evaluation and user study demonstrate the accuracy of our generated GUIs
and the usefulness of our tool.
Comment: 30 pages, 15 figure