Understanding Android Obfuscation Techniques: A Large-Scale Investigation in the Wild
In this paper, we seek to better understand Android obfuscation and depict a
holistic view of the usage of obfuscation through a large-scale investigation
in the wild. In particular, we focus on four popular obfuscation approaches:
identifier renaming, string encryption, Java reflection, and packing. To obtain
meaningful statistical results, we designed efficient and lightweight
detection models for each obfuscation technique and applied them to our massive
APK datasets (collected from Google Play, multiple third-party markets, and
malware databases). We learned several interesting facts from the results.
For example, malware authors use string encryption more frequently, and apps
on third-party markets are packed more often than those on Google Play. To
explain each finding, we also carried out in-depth code analysis on a sample of
the apps. We believe our study will help developers select the most suitable
obfuscation approach and, at the same time, help researchers improve code
analysis systems in the right direction.
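The abstract does not spell out the detection models, but a lightweight detector for identifier renaming can be sketched from a common observation: obfuscators such as ProGuard rewrite identifiers to very short machine-generated names like `a`, `b`, or `ab`. The heuristic below (an illustrative assumption, not the paper's actual model) flags an app as renamed when the fraction of such names exceeds a threshold:

```python
import re

# Heuristic sketch (not the paper's actual model): ProGuard-style renaming
# yields many one- or two-letter lowercase identifiers such as 'a' or 'ab'.
SHORT_NAME = re.compile(r"^[a-z]{1,2}$")

def renaming_ratio(identifiers):
    """Fraction of identifiers that look machine-renamed."""
    if not identifiers:
        return 0.0
    short = sum(1 for name in identifiers if SHORT_NAME.match(name))
    return short / len(identifiers)

def looks_renamed(identifiers, threshold=0.5):
    """Flag an app whose extracted identifiers are mostly short names."""
    return renaming_ratio(identifiers) >= threshold

plain = ["MainActivity", "onCreate", "loadConfig"]
obfuscated = ["a", "b", "ab", "onCreate"]
print(looks_renamed(plain))       # → False
print(looks_renamed(obfuscated))  # → True
```

The threshold of 0.5 is arbitrary here; a real detector would be calibrated on labelled APKs, since legitimate code also contains some short names.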
Dublin Smart City Data Integration, Analysis and Visualisation
Data is an important resource for any organisation, helping it to understand its in-depth workings and to identify unseen trends within the data. When this data is efficiently processed and analysed, it helps the authorities take appropriate decisions based on the derived insights and knowledge; through these decisions, service quality can be improved and the customer experience enhanced. Massive growth in data generation has been observed over the past two decades, and a significant part of this data is generated by dumb and smart sensors. If this raw data is processed in an efficient manner, it can raise quality levels in areas such as data mining, data analytics, business intelligence and data visualisation.
Operationalizing and automating data governance
The ability to cross data from multiple sources represents a competitive advantage for organizations. Yet, the governance of the data lifecycle, from the data sources to valuable insights, is largely performed in an ad-hoc or manual manner. This is especially concerning in scenarios where tens or hundreds of continuously evolving data sources produce semi-structured data. To overcome this challenge, we develop a framework for operationalizing and automating data governance. For the former, we propose a zoned data lake architecture and a set of data governance processes that allow the systematic ingestion, transformation and integration of data from heterogeneous sources, in order to make them readily available for business users. For the latter, we propose a set of metadata artifacts that allow the automatic execution of data governance processes, addressing a wide range of data management challenges. We showcase the usefulness of the proposed approach on a real-world use case stemming from a collaborative project with the World Health Organization for the management and analysis of data about Neglected Tropical Diseases. Overall, this work contributes to facilitating the adoption of data-driven strategies by organizations through a cohesive framework for operationalizing and automating data governance. This work was partly supported by the DOGO4ML project, funded by the Spanish Ministerio de Ciencia e Innovación under project PID2020-117191RB-I00/AEI/10.13039/501100011033. Sergi Nadal is partly supported by the Spanish Ministerio de Ciencia e Innovación, as well as the European Union - NextGenerationEU, under project FJC2020-045809-I/AEI/10.13039/501100011033.
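The core idea of metadata-driven automation can be sketched as follows. In this minimal Python illustration (all source names, field names, and mappings are hypothetical, not taken from the paper), a metadata artifact describes each source's field-to-schema mapping, and a single generic ingestion routine uses it to harmonize records from any registered source without per-source code:

```python
# Illustrative sketch (source and field names hypothetical): metadata
# artifacts describe each source, and one generic process uses them to
# rename source-specific fields into a canonical schema.
SOURCE_METADATA = {
    "who_ntd_cases": {
        "format": "json",
        "mapping": {"ctry": "country", "dz": "disease", "n": "cases"},
    },
}

def ingest(source_id, record):
    """Map a raw semi-structured record onto the canonical schema."""
    mapping = SOURCE_METADATA[source_id]["mapping"]
    return {mapping.get(key, key): value for key, value in record.items()}

raw = {"ctry": "BR", "dz": "leprosy", "n": 12}
print(ingest("who_ntd_cases", raw))
# → {'country': 'BR', 'disease': 'leprosy', 'cases': 12}
```

When a source evolves, only its metadata entry changes; the ingestion process itself stays fixed, which is what makes governance over hundreds of evolving sources tractable.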
Automated Change Rule Inference for Distance-Based API Misuse Detection
Developers build on Application Programming Interfaces (APIs) to reuse
existing functionalities of code libraries. Despite the benefits of reusing
established libraries (e.g., time savings, high quality), developers may
diverge from the API's intended usage, potentially causing bugs or, more
specifically, API misuses. Recent research focuses on developing techniques to
automatically detect API misuses, but many suffer from a high false-positive
rate. In this article, we improve on this situation by proposing ChaRLI (Change
RuLe Inference), a technique for automatically inferring change rules from
developers' fixes of API misuses based on API Usage Graphs (AUGs). By
subsequently applying graph-distance algorithms, we use change rules to
discriminate API misuses from correct usages. This allows developers to reuse
others' fixes of an API misuse at other code locations in the same or another
project. We evaluated the ability of change rules to detect API misuses based
on three datasets and found that the best mean relative precision (i.e., for
testable usages) ranges from 77.1 % to 96.1 % while the mean recall ranges from
0.007 % to 17.7 % for individual change rules. These results indicate that
ChaRLI and our misuse detection are a helpful complement to existing API misuse
detectors.
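The distance-based discrimination step can be sketched with a deliberately simplified model. Here an API usage is a set of edges (a stand-in for the paper's API Usage Graphs), the distance between two usages is the size of their symmetric difference, and a change rule pairs a misuse pattern with its fixed counterpart; a usage is flagged when it lies closer to the misuse side than to the fixed side. The iterator rule below is a hypothetical example, not one inferred by ChaRLI:

```python
# Simplified sketch of distance-based misuse detection: usages are edge
# sets rather than full API Usage Graphs, and the distance is the size of
# the symmetric difference instead of a true graph-edit distance.
def distance(usage_a, usage_b):
    """Number of edges present in exactly one of the two usages."""
    return len(usage_a ^ usage_b)

def is_misuse(usage, rule):
    """Flag a usage that is closer to the misuse side of a change rule."""
    misuse_graph, fixed_graph = rule
    return distance(usage, misuse_graph) < distance(usage, fixed_graph)

# Hypothetical change rule: the fix adds a hasNext() check before next().
misuse = {("loop", "Iterator.next")}
fixed = {("loop", "Iterator.hasNext"), ("loop", "Iterator.next")}
rule = (misuse, fixed)

print(is_misuse({("loop", "Iterator.next")}, rule))  # → True
print(is_misuse(fixed, rule))                        # → False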