30 research outputs found

    Security in Data Mining- A Comprehensive Survey

    Get PDF
    Data mining techniques, while allowing the individuals to extract hidden knowledge on one hand, introduce a number of privacy threats on the other hand. In this paper, we study some of these issues along with a detailed discussion on the applications of various data mining techniques for providing security. An efficient classification technique when used properly, would allow an user to differentiate between a phishing website and a normal website, to classify the users as normal users and criminals based on their activities on Social networks (Crime Profiling) and to prevent users from executing malicious codes by labelling them as malicious. The most important applications of Data mining is the detection of intrusions, where different Data mining techniques can be applied to effectively detect an intrusion and report in real time so that necessary actions are taken to thwart the attempts of the intruder. Privacy Preservation, Outlier Detection, Anomaly Detection and PhishingWebsite Classification are discussed in this paper

    Machine learning techniques for identification using mobile and social media data

    Get PDF
    Networked access and mobile devices provide near constant data generation and collection. Users, environments, applications, each generate different types of data; from the voluntarily provided data posted in social networks to data collected by sensors on mobile devices, it is becoming trivial to access big data caches. Processing sufficiently large amounts of data results in inferences that can be characterized as privacy invasive. In order to address privacy risks we must understand the limits of the data exploring relationships between variables and how the user is reflected in them. In this dissertation we look at data collected from social networks and sensors to identify some aspect of the user or their surroundings. In particular, we find that from social media metadata we identify individual user accounts and from the magnetic field readings we identify both the (unique) cellphone device owned by the user and their course-grained location. In each project we collect real-world datasets and apply supervised learning techniques, particularly multi-class classification algorithms to test our hypotheses. We use both leave-one-out cross validation as well as k-fold cross validation to reduce any bias in the results. Throughout the dissertation we find that unprotected data reveals sensitive information about users. Each chapter also contains a discussion about possible obfuscation techniques or countermeasures and their effectiveness with regards to the conclusions we present. Overall our results show that deriving information about users is attainable and, with each of these results, users would have limited if any indication that any type of analysis was taking place

    Security in Data Mining-A Comprehensive Survey

    Get PDF
    Data mining techniques, while allowing the individuals to extract hidden knowledge on one hand, introduce a number of privacy threats on the other hand. In this paper, we study some of these issues along with a detailed discussion on the applications of various data mining techniques for providing security. An efficient classification technique when used properly, would allow an user to differentiate between a phishing website and a normal website, to classify the users as normal users and criminals based on their activities on Social networks (Crime Profiling) and to prevent users from executing malicious codes by labelling them as malicious. The most important applications of Data mining is the detection of intrusions, where different Data mining techniques can be applied to effectively detect an intrusion and report in real time so that necessary actions are taken to thwart the attempts of the intruder

    Detection of Anomalous Behavior of IoT/CPS Devices Using Their Power Signals

    Get PDF
    Embedded computing devices, in the Internet of Things (IoT) or Cyber-Physical Systems (CPS), are becoming pervasive in many domains around the world. Their wide deployment in simple applications (e.g., smart buildings, fleet management, and smart agriculture) or in more critical operations (e.g., industrial control, smart power grids, and self-driving cars) creates significant market potential ($ 4-11 trillion in annual revenue is expected by 2025). A main requirement for the success of such systems and applications is the capacity to ensure the performance of these devices. This task includes equipping them to be resilient against security threats and failures. Globally, several critical infrastructure applications have been the target of cyber attacks. These recent incidents, as well as the rich applicable literature, confirm that more research is needed to overcome such challenges. Consequently, the need for robust approaches that detect anomalous behaving devices in security and safety-critical applications has become paramount. Solving such a problem minimizes different kinds of losses (e.g., confidential data theft, financial loss, service access restriction, or even casualties). In light of the aforementioned motivation and discussion, this thesis focuses on the problem of detecting the anomalous behavior of IoT/CPS devices by considering their side-channel information. Solving such a problem is extremely important in maintaining the security and dependability of critical systems and applications. Although several side-channel based approaches are found in the literature, there are still important research gaps that need to be addressed. First, the intrusive nature of the monitoring in some of the proposed techniques results in resources overhead and requires instrumentation of the internal components of a device, which makes them impractical. It also raises a data integrity flag. Second, the lack of realistic experimental power consumption datasets that reflect the normal and anomalous behaviors of IoT and CPS devices has prevented fair and coherent comparisons with the state of the art in this domain. Finally, most of the research to date has concentrated on the accuracy of detection and not the novelty of detecting new anomalies. Such a direction relies on: (i) the availability of labeled datasets; (ii) the complexity of the extracted features; and (iii) the available compute resources. These assumptions and requirements are usually unrealistic and unrepresentative. This research aims to bridge these gaps as follows. First, this study extends the state of the art that adopts the idea of leveraging the power consumption of devices as a signal and the concept of decoupling the monitoring system and the devices to be monitored to detect and classify the "operational health'' of the devices. Second, this thesis provides and builds power consumption-based datasets that can be utilized by AI as well as security research communities to validate newly developed detection techniques. The collected datasets cover a wide range of anomalous device behavior due to the main aspects of device security (i.e., confidentiality, integrity, and availability) and partial system failures. The extensive experiments include: a wide spectrum of various emulated malware scenarios; five real malware applications taken from the well-known Drebin dataset; distributed denial of service attack (DDOS) where an IoT device is treated as: (1) a victim of a DDOS attack, and (2) the source of a DDOS attack; cryptomining malware where the resources of an IoT device are being hijacked to be used to advantage of the attacker’s wish and desire; and faulty CPU cores. This level of extensive validation has not yet been reported in any study in the literature. Third, this research presents a novel supervised technique to detect anomalous device behavior based on transforming the problem into an image classification problem. The main aim of this methodology is to improve the detection performance. In order to achieve the goals of this study, the methodology combines two powerful computer vision tools, namely Histograms of Oriented Gradients (HOG) and a Convolutional Neural Network (CNN). Such a detection technique is not only useful in this present case but can contribute to most time-series classification (TSC) problems. Finally, this thesis proposes a novel unsupervised detection technique that requires only the normal behavior of a device in the training phase. Therefore, this methodology aims at detecting new/unseen anomalous behavior. The methodology leverages the power consumption of a device and Restricted Boltzmann Machine (RBM) AutoEncoders (AE) to build a model that makes them more robust to the presence of security threats. The methodology makes use of stacked RBM AE and Principal Component Analysis (PCA) to extract feature vector based on AE's reconstruction errors. A One-Class Support Vector Machine (OC-SVM) classifier is then trained to perform the detection task. Across 18 different datasets, both of our proposed detection techniques demonstrated high detection performance with at least ~ 88% accuracy and 85% F-Score on average. The empirical results indicate the effectiveness of the proposed techniques and demonstrated improved detection performance gain of 9% - 17% over results reported in other methods

    Security in Computer and Information Sciences

    Get PDF
    This open access book constitutes the thoroughly refereed proceedings of the Second International Symposium on Computer and Information Sciences, EuroCybersec 2021, held in Nice, France, in October 2021. The 9 papers presented together with 1 invited paper were carefully reviewed and selected from 21 submissions. The papers focus on topics of security of distributed interconnected systems, software systems, Internet of Things, health informatics systems, energy systems, digital cities, digital economy, mobile networks, and the underlying physical and network infrastructures. This is an open access book

    Security Issues of Mobile and Smart Wearable Devices

    Get PDF
    Mobile and smart devices (ranging from popular smartphones and tablets to wearable fitness trackers equipped with sensing, computing and networking capabilities) have proliferated lately and redefined the way users carry out their day-to-day activities. These devices bring immense benefits to society and boast improved quality of life for users. As mobile and smart technologies become increasingly ubiquitous, the security of these devices becomes more urgent, and users should take precautions to keep their personal information secure. Privacy has also been called into question as so many of mobile and smart devices collect, process huge quantities of data, and store them on the cloud as a matter of fact. Ensuring confidentiality, integrity, and authenticity of the information is a cybersecurity challenge with no easy solution. Unfortunately, current security controls have not kept pace with the risks posed by mobile and smart devices, and have proven patently insufficient so far. Thwarting attacks is also a thriving research area with a substantial amount of still unsolved problems. The pervasiveness of smart devices, the growing attack vectors, and the current lack of security call for an effective and efficient way of protecting mobile and smart devices. This thesis deals with the security problems of mobile and smart devices, providing specific methods for improving current security solutions. Our contributions are grouped into two related areas which present natural intersections and corresponds to the two central parts of this document: (1) Tackling Mobile Malware, and (2) Security Analysis on Wearable and Smart Devices. In the first part of this thesis, we study methods and techniques to assist security analysts to tackle mobile malware and automate the identification of malicious applications. We provide threefold contributions in tackling mobile malware: First, we introduce a Secure Message Delivery (SMD) protocol for Device-to-Device (D2D) networks, with primary objective of choosing the most secure path to deliver a message from a sender to a destination in a multi-hop D2D network. Second, we illustrate a survey to investigate concrete and relevant questions concerning Android code obfuscation and protection techniques, where the purpose is to review code obfuscation and code protection practices. We evaluate efficacy of existing code de-obfuscation tools to tackle obfuscated Android malware (which provide attackers with the ability to evade detection mechanisms). Finally, we propose a Machine Learning-based detection framework to hunt malicious Android apps by introducing a system to detect and classify newly-discovered malware through analyzing applications. The proposed system classifies different types of malware from each other and helps to better understanding how malware can infect devices, the threat level they pose and how to protect against them. Our designed system leverages more complete coverage of apps’ behavioral characteristics than the state-of-the-art, integrates the most performant classifier, and utilizes the robustness of extracted features. The second part of this dissertation conducts an in-depth security analysis of the most popular wearable fitness trackers on the market. Our contributions are grouped into four central parts in this domain: First, we analyze the primitives governing the communication between fitness tracker and cloud-based services. In addition, we investigate communication requirements in this setting such as: (i) Data Confidentiality, (ii) Data Integrity, and (iii) Data Authenticity. Second, we show real-world demos on how modern wearable devices are vulnerable to false data injection attacks. Also, we document successful injection of falsified data to cloud-based services that appears legitimate to the cloud to obtain personal benefits. Third, we circumvent End-to-End protocol encryption implemented in the most advanced and secure fitness trackers (e.g., Fitbit, as the market leader) through Hardware-based reverse engineering. Last but not least, we provide guidelines for avoiding similar vulnerabilities in future system designs

    Trustworthy machine learning through the lens of privacy and security

    Get PDF
    Nowadays, machine learning (ML) becomes ubiquitous and it is transforming society. However, there are still many incidents caused by ML-based systems when ML is deployed in real-world scenarios. Therefore, to allow wide adoption of ML in the real world, especially in critical applications such as healthcare, finance, etc., it is crucial to develop ML models that are not only accurate but also trustworthy (e.g., explainable, privacy-preserving, secure, and robust). Achieving trustworthy ML with different machine learning paradigms (e.g., deep learning, centralized learning, federated learning, etc.), and application domains (e.g., computer vision, natural language, human study, malware systems, etc.) is challenging, given the complicated trade-off among utility, scalability, privacy, explainability, and security. To bring trustworthy ML to real-world adoption with the trust of communities, this study makes a contribution of introducing a series of novel privacy-preserving mechanisms in which the trade-off between model utility and trustworthiness is optimized in different application domains, including natural language models, federated learning with human and mobile sensing applications, image classification, and explainable AI. The proposed mechanisms reach deployment levels of commercialized systems in real-world trials while providing trustworthiness with marginal utility drops and rigorous theoretical guarantees. The developed solutions enable safe, efficient, and practical analyses of rich and diverse user-generated data in many application domains

    Malware Propagation in Online Social Networks: Modeling, Analysis and Real-world Implementations

    Get PDF
    The popularity and wide spread usage of online social networks (OSNs) have attracted hackers and cyber criminals to use OSNs as an attack platform to spread malware. Over the last few years, Facebook users have experienced hundreds of malware attacks. A successful attack can lead to tens of millions of OSN accounts being compromised and computers being infected. Cyber criminals can mount massive denial of service attacks against Internet infrastructures or systems using compromised accounts and computers. Malware infecting a user's computer have the ability to steal login credentials and other confidential information stored on the computer, install ransomware and infect other computers on the same network. Therefore, it is important to understand propagation dynamics of malware in OSNs in order to detect, contain and remove them as early as possible. The objective of this dissertation is thus to model and study propagation dynamics of various types of malware in social networks such as Facebook, LinkedIn and Orkut. In particular, - we propose analytical models that characterize propagation dynamics of cross-site scripting and Trojan malware, the two major types of malware propagating in OSNs. Our models assume the topological characteristics of real-world social networks, namely, low average shortest distance, power-law distribution of node degrees and high clustering coefficient. The proposed models were validated using a real-world social network graph. - we present the design and implementation of a cellular botnet named SoCellBot that uses the OSN platform as a means to recruit and control cellular bots on smartphones. SoCellBot utilizes OSN messaging systems as communication channels between bots. We then present a simulation-based analysis of the botnet's strategies to maximize the number of infected victims within a short amount of time and, at the same time, minimize the risk of being detected. - we describe and analyze emerging malware threats in OSNs, namely, clickjacking, extension-based and Magnet malware. We discuss their implementations and working mechanics, and analyze their propagation dynamics via simulations. - we evaluate the performance of several selective monitoring schemes used for malware detection in OSNs. With selective monitoring, we select a set of important users in the network and monitor their and their friends activities and posts for malware threats. These schemes differ in how the set of important users is selected. We evaluate and compare the effectiveness of several selective monitoring schemes in terms of malware detection in OSNs

    Data Mining

    Get PDF
    The availability of big data due to computerization and automation has generated an urgent need for new techniques to analyze and convert big data into useful information and knowledge. Data mining is a promising and leading-edge technology for mining large volumes of data, looking for hidden information, and aiding knowledge discovery. It can be used for characterization, classification, discrimination, anomaly detection, association, clustering, trend or evolution prediction, and much more in fields such as science, medicine, economics, engineering, computers, and even business analytics. This book presents basic concepts, ideas, and research in data mining
    corecore