3,460 research outputs found
Big Data Testing Techniques: Taxonomy, Challenges and Future Trends
Big Data is reforming many industrial domains by providing decision support
through analyzing large data volumes. Big Data testing aims to ensure that Big
Data systems run smoothly and error-free while maintaining the performance and
quality of data. However, because of the diversity and complexity of data,
testing Big Data is challenging. Though numerous research efforts deal with Big
Data testing, a comprehensive review to address testing techniques and
challenges of Big Data is not available as yet. Therefore, we have
systematically reviewed the evidence on Big Data testing techniques published in
the period 2010-2021. This paper discusses the testing of data processing by
highlighting the techniques used in every processing phase. Furthermore, we
discuss the challenges and future directions. Our findings show that diverse
functional, non-functional and combined (functional and non-functional) testing
techniques have been used to solve specific problems related to Big Data. At
the same time, most of the testing challenges have been faced during the
MapReduce validation phase. In addition, the combinatorial testing technique is
one of the most applied techniques in combination with other techniques (i.e.,
random testing, mutation testing, input space partitioning and equivalence
testing) to find various functional faults through Big Data testing.
Comment: 32 pages
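As an illustration of the combinatorial testing technique mentioned in this abstract, the sketch below builds a pairwise (2-way) test suite with a simple greedy strategy. It is a minimal sketch, not the method of any reviewed paper: the function names and the example parameters (`format`, `compression`, `cluster`) are hypothetical, and the greedy search enumerates the full Cartesian product, so it only suits small parameter spaces.

```python
from itertools import combinations, product


def uncovered_pairs(parameters, tests):
    """Return the parameter-value pairs that no test in `tests` exercises."""
    names = sorted(parameters)
    required = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in parameters[a]
        for vb in parameters[b]
    }
    for test in tests:
        for a, b in combinations(names, 2):
            required.discard(((a, test[a]), (b, test[b])))
    return required


def greedy_pairwise(parameters):
    """Greedily pick full configurations until every value pair is covered."""
    names = sorted(parameters)
    candidates = [
        dict(zip(names, values))
        for values in product(*(parameters[n] for n in names))
    ]
    suite = []
    remaining = uncovered_pairs(parameters, suite)
    while remaining:
        # Choose the configuration that covers the most still-uncovered pairs.
        best = max(
            candidates,
            key=lambda t: sum(
                ((a, t[a]), (b, t[b])) in remaining
                for a, b in combinations(names, 2)
            ),
        )
        suite.append(best)
        remaining = uncovered_pairs(parameters, suite)
    return suite


# Hypothetical configuration space for a data-processing job.
params = {
    "format": ["csv", "parquet"],
    "compression": ["none", "gzip"],
    "cluster": ["local", "yarn"],
}
suite = greedy_pairwise(params)
```

The resulting suite covers every pair of parameter values with no more test cases than exhaustive enumeration, which is the property that makes combinatorial testing attractive when it is combined with other techniques.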
Cyber Security
This open access book constitutes the refereed proceedings of the 17th International Annual Conference on Cyber Security, CNCERT 2021, held in Beijing, China, in July 2021. The 14 papers presented were carefully reviewed and selected from 51 submissions. The papers are organized according to the following topical sections: data security; privacy protection; anomaly detection; traffic analysis; social network security; vulnerability detection; text classification
Do Androids Dream of Electric Sheep? On Privacy in the Android Supply Chain
The Android Open Source Project (AOSP) was first released by Google in 2008 and
has since become the most used operating system [Andaf]. Thanks to the openness
of its source code, any smartphone vendor or original equipment manufacturer
(OEM) can modify and adapt Android to their specific needs, or add proprietary features
before installing it on their devices, in order to differentiate themselves
from competitors. This has created a complex and diverse supply chain, completely opaque to
end-users, formed by manufacturers, resellers, chipset manufacturers, network operators, and
prominent actors of the online industry that partnered with OEMs. Each of these stakeholders
can pre-install extra apps, or implement proprietary features at the framework level.
However, such customizations can create privacy and security threats to end-users. Pre-installed
apps are privileged by the operating system, and can therefore access system APIs
or personal data more easily than apps installed by the user. Unfortunately, despite these
potential threats, there is currently no end-to-end control over what apps come pre-installed
on a device and why, and no traceability of the different software and hardware components
used in a given Android device. In fact, the landscape of pre-installed software in Android and
its security and privacy implications has largely remained unexplored by researchers.
In this thesis, I investigate the customization of Android devices and their impact on the
privacy and security of end-users. Specifically, I perform the first large-scale and systematic
analysis of pre-installed Android apps and the supply chain. To do so, I first develop an app,
Firmware Scanner [Sca], to crowdsource close to 34,000 Android firmware versions from 1,000
different OEMs from all over the world. This dataset allows us to map the stakeholders involved
in the supply chain and their relationships, from device manufacturers and mobile network operators
to third-party organizations like advertising and tracking services, and social network
platforms. I could identify multiple cases of privacy-invasive and potentially harmful behaviors.
My results show a disturbing lack of transparency and control over the Android supply
chain, which can harm the privacy and security of end-users.
Next, I study the evolution of the Android permission system, an essential security feature of the Android framework. Coupled with other protection mechanisms such as process sandboxing,
the permission system empowers users to control what sensitive resources (e.g., user
contacts, the camera, location sensors) are accessible to which apps. The research community
has extensively studied the permission system, but most previous studies focus on its limitations
or specific attacks. In this thesis, I present an up-to-date view and longitudinal analysis
of the evolution of the permission system. I study how some lesser-known features of the
permission system, specifically permission flags, can impact the permission granting process,
making it either more or less restrictive. I then highlight how developers of pre-installed apps
use said flags in the wild and focus on the privacy and security implications. Specifically, I
show the presence of third-party apps, installed as privileged system apps, potentially using
said features to share resources with other third-party apps.
Another salient feature of the permission system is its extensibility: apps can define their
own custom permissions to expose features and data to other apps. However, little is known
about how widespread the usage of custom permissions is, and what impact these permissions
may have on users’ privacy and security. In the last part of this thesis, I investigate the exposure
and request of custom permissions in the Android ecosystem and their potential for opening
privacy and security risks. I gather a dataset of 2.2 million apps, both pre-installed and
publicly available apps using both Firmware Scanner and purpose-built app store crawlers.
I find the usage of custom permissions to be pervasive, regardless of the origin of the apps,
and seemingly growing over time. Despite this prevalence, I find that custom permissions are
virtually invisible to end-users, and their purpose is mostly undocumented. While Google recommends
that developers use their reverse domain name as the prefix of their custom permissions
[Gpla], I find widespread violations of this recommendation, making sound attribution
at scale virtually impossible. Through static analysis methods, I demonstrate that custom permissions
can facilitate access to permission-protected system resources to apps that lack those
permissions, without user awareness. Due to the lack of tools for studying such risks, I design
and implement two tools, PermissionTracer [Pere] and PermissionTainter [Perd] to study
custom permissions. I highlight multiple cases of concerning use of custom permissions by
Android apps in the wild.
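The naming recommendation discussed above can be checked with a simple heuristic. The sketch below is a minimal illustration, not the thesis's tooling (PermissionTracer and PermissionTainter are not reproduced here): it assumes that the first two labels of an app's package name approximate the developer's reverse domain name, and the function name and `depth` parameter are hypothetical.

```python
def follows_prefix_convention(package_name: str, permission: str, depth: int = 2) -> bool:
    """Check whether a custom permission name starts with the app's reverse-domain prefix.

    Assumption: the first `depth` labels of the package name (e.g. "com.example")
    stand in for the developer's reverse domain name.
    """
    prefix = ".".join(package_name.split(".")[:depth])
    # Require a label boundary so "com.examplefoo.X" does not match "com.example".
    return permission == prefix or permission.startswith(prefix + ".")


# Hypothetical package and permission names, for illustration only.
ok = follows_prefix_convention("com.example.app", "com.example.app.permission.READ_DATA")
bad = follows_prefix_convention("com.example.app", "android.permission.CUSTOM_FEATURE")
```

A check of this kind only flags names that deviate from the convention; it cannot by itself attribute a permission to its true owner, which is consistent with the attribution difficulty the abstract reports.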
In this thesis, I systematically studied, at scale, the vast and overlooked ecosystem of pre-installed
Android apps. My results show a complete lack of control of the supply chain, which
is worrying given the huge potential impact of pre-installed apps on the privacy and security
of end-users. I conclude with a number of open research questions and future avenues for
further research in the ecosystem of the supply chain of Android devices.
This work has been supported by IMDEA Networks Institute.
Doctoral Programme in Telematics Engineering, Universidad Carlos III de Madrid.
Chair: Douglas Leith. Secretary: Rubén Cuevas Rumín. Member: Hamed Haddad
The Road to Transformation: Ascending from the Decade of Darkness
Nobody likes mistakes. Fewer yet like to revisit errors—to analyze, discuss or study them. They are often an embarrassment and remind us of our fallibility and shortcomings. It is always much easier to celebrate our achievements and successes—that leaves everyone with a warm feeling. However, although it is always preferable to avoid making mistakes, once they occur they are important and must be recognized as such. They speak to our weaknesses as both individuals and institutions. They are signals, if not alarms, to warn us of deficiencies that must be addressed. In fact, it has often been said for good reason that one can learn more from one’s mistakes than from one’s successes.
The military has always been bad at accepting this premise. Mistakes are often construed as a sign of weakness or inability and many perceive them as potential career-ending events. Such zero tolerance for mistakes breeds an environment of risk aversion, micro-management and stagnation. It kills initiative and experimentation. And, it avoids examining mistakes in detail—lest blame insidiously spread its evil tentacles and taint others in the chain of command. However, this state of affairs leads to atrophy within an organization.
It takes strong will and determination to break such a cycle. Normally, crisis is the only catalyst that compels leadership within an organization to take action, and even then it is difficult. The Department of National Defence (DND) and the Canadian Forces (CF), particularly the officer corps, found themselves in such a situation in the late 1980s and 1990s. By 1997, they were at the lowest ebb of their history. They had lost the confidence and trust of the government and Canadian people they served. They were stripped of their ability to investigate themselves. Furthermore, they were not trusted to implement the recommended changes forced upon them by the government and an external committee was established as a watchdog. Whether the leadership wanted to admit it or not, and they vehemently denied it at the time, there existed some substantial and deep rooted problems with DND, the CF and the officer corps. They were caught in a decade of darkness
Third Party Tracking in the Mobile Ecosystem
Third party tracking allows companies to identify users and track their
behaviour across multiple digital services. This paper presents an empirical
study of the prevalence of third-party trackers on 959,000 apps from the US and
UK Google Play stores. We find that most apps contain third party tracking, and
the distribution of trackers is long-tailed with several highly dominant
trackers accounting for a large portion of the coverage. The extent of tracking
also differs between categories of apps; in particular, news apps and apps
targeted at children appear to be amongst the worst in terms of the number of
third party trackers associated with them. Third party tracking is also
revealed to be a highly trans-national phenomenon, with many trackers operating
in jurisdictions outside the EU. Based on these findings, we draw out some
significant legal compliance challenges facing the tracking industry.
Comment: Corrected missing company info (Linkedin owned by Microsoft). Figures
for Microsoft and Linkedin re-calculated and added to Table
Sécurité et protection de la vie privée dans les systèmes embarqués automobiles
Electronic equipment has become an integral part of a vehicle's network architecture, which consists of multiple buses and microcontrollers called Electronic Control Units (ECUs). Recently, these ECUs have also begun connecting to the outside world. Navigation and entertainment systems, consumer devices, and Car2X functions are examples of this. Recent security analyses have shown severe vulnerabilities in exposed ECUs and protocols, which may make it possible for attackers to gain control over a vehicle. Given that safety-critical car systems can no longer be fully isolated from such third-party devices and infotainment services, we propose a new approach to securing vehicular on-board systems that combines mechanisms at different layers of the communication stack and of the execution platforms. We describe our secure communication protocols, which are designed to provide strong cryptographic assurances together with an efficient implementation fitting the prevalent vehicular communication paradigms. They rely on hardware security modules providing secure storage and acting as root of trust. An approach based on distributed data flow tracking is employed for checking code execution against a security policy describing authorized communication patterns. Binary instrumentation is used to track data flows throughout execution (taint engine) and also between control units (middleware), thus making it applicable to industrial applications. We evaluate the feasibility of our mechanisms to secure communication on the CAN bus, which is ubiquitously implemented in cars today. A proof-of-concept demonstrator also shows the feasibility of integrating security features into real vehicles
The Effect of Code Obfuscation on Authorship Attribution of Binary Computer Files
In many forensic investigations, questions linger regarding the identity of the authors of a software specimen. Research has identified methods for the attribution of binary files that have not been obfuscated, but a significant percentage of malicious software has been obfuscated in an effort to hide both the details of its origin and its true intent. Little research has been done around analyzing obfuscated code for attribution. In part, the reason for this gap in the research is that deobfuscation of an unknown program is a challenging task. Further, the additional transformation of the executable file introduced by the obfuscator modifies or removes features from the original executable that would have been used in the author attribution process. Existing research has demonstrated good success in attributing the authorship of an executable file of unknown provenance using methods based on static analysis of the specimen file. With the addition of file obfuscation, static analysis of files becomes difficult and time-consuming and, in some cases, may lead to inaccurate findings. This paper presents a novel process for authorship attribution using dynamic analysis methods. A software-emulated system was fully instrumented to become a test harness for a specimen of unknown provenance, allowing for supervised control, monitoring, and trace data collection during execution. This trace data was used as input into a supervised machine learning algorithm trained to identify stylometric differences in the specimen under test and provide predictions on who wrote the specimen. The specimen files were also analyzed for authorship using static analysis methods to compare prediction accuracies with those obtained from this new, dynamic-analysis-based method. Experiments indicate that this new method can provide better accuracy of author attribution for files of unknown provenance, especially in the case where the specimen file has been obfuscated