2 research outputs found

Detecting Data Leakage from Databases on Android Apps with Concept Drift

Authors: Chandola Varun, Kul Gokhan, Upadhyaya Shambhu
Publication date: 29/05/2018

Mobile databases are the statutory backbones of many smartphone applications, and they store a lot of sensitive information. However, vulnerabilities in the operating system or the app logic can leak sensitive data by giving adversaries unauthorized access to the app's database. In this paper, we study such vulnerabilities to define a threat model, and we propose an OS-version-independent protection mechanism that app developers can use to detect such attacks. To do so, we model user behavior with the database query workload created by the original apps. We model the drift in behavior by comparing probability distributions of the query workload features over time, and we then use this model to determine whether the app's behavior drift is anomalous. We evaluate our framework on real-world workloads of three popular Android apps and show that our system detects more than 90% of such attacks.

Comment: This paper is accepted to be published in the proceedings of IEEE TrustCom 201

arXiv.org e-Print Archive
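
The abstract above describes detecting drift by comparing probability distributions of query workload features over time. The following is a minimal sketch of that general idea, not the authors' method: the choice of Jensen-Shannon divergence, the `statement_type` feature, and the `threshold` value are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def statement_type(query):
    """Hypothetical feature: the leading SQL keyword of a query string."""
    return query.split()[0].upper()


def distribution(queries, feature):
    """Empirical probability distribution of one feature over a window of queries."""
    counts = Counter(feature(q) for q in queries)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns 0 for identical distributions and 1 for distributions with
    disjoint support.
    """
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mix(a):
        # Kullback-Leibler divergence from `a` to the mixture distribution.
        return sum(a[k] * log2(a[k] / mix[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


def is_anomalous(baseline, window, feature, threshold=0.5):
    """Flag a window whose feature distribution drifts far from the baseline.

    The threshold is an assumed value; a real deployment would tune it.
    """
    drift = js_divergence(distribution(baseline, feature),
                          distribution(window, feature))
    return drift > threshold
```

For instance, a baseline dominated by `SELECT` and `INSERT` statements compared against a window made up only of `DELETE` statements yields the maximum divergence of 1 and would be flagged under the assumed threshold.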

Query Log Compression for Workload Analytics

Authors: Chandola Varun, Kennedy Oliver, Xie Ting
Publication date: 29/09/2018

Analyzing database access logs is a key part of performance tuning, intrusion detection, benchmark development, and many other database administration tasks. Unfortunately, production databases commonly handle millions of queries or more each day, so these logs must be summarized before they can be used. Designing an appropriate summary encoding requires trading off conciseness against information content. For example, simple workload sampling may miss rare but high-impact queries. In this paper, we present LogR, a lossy log compression scheme suitable for use by many automated log analytics tools, as well as for human inspection. We formalize and analyze the space/fidelity trade-off in the context of a broader family of "pattern" and "pattern mixture" log encodings to which LogR belongs. We show through a series of experiments that LogR compressed encodings can be created efficiently, come with provable information-theoretic bounds on their accuracy, and outperform state-of-the-art log summarization strategies.

Comment: Typos fixed, some irrelevant figures and paragraphs are trimmed

arXiv.org e-Print Archive
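
The abstract's remark that simple workload sampling may miss rare queries can be made concrete with a short calculation. This is an illustrative sketch only and is not taken from the paper: it assumes the sample is drawn uniformly and independently (with replacement), and the function name `miss_probability` is hypothetical.

```python
def miss_probability(frequency: float, sample_size: int) -> float:
    """Chance that a uniform sample never picks a given kind of query.

    `frequency` is the share of the log made up by that query (0 to 1),
    and `sample_size` is the number of independent draws (an assumption;
    sampling without replacement would give a slightly different value).
    """
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    return (1.0 - frequency) ** sample_size


# A query that makes up one millionth of the log, sampled 10,000 times.
print(round(miss_probability(1e-6, 10_000), 4))  # prints 0.99
```

Under these assumptions, a query that accounts for one in a million log entries is absent from a 10,000-entry sample about 99% of the time, which is why a summary built from a plain sample can lose exactly the queries an analyst cares about.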