411 research outputs found

    Security of Graph Data: Hashing Schemes and Definitions

    Get PDF
    Use of graph-structured data models is on the rise - in graph databases, in representing biological and healthcare data as well as geographical data. In order to secure graph-structured data, and develop cryptographically secure schemes for graph databases, it is essential to formally define and develop suitable collision resistant one-way hashing schemes and show them they are efficient. The widely used Merkle hash technique is not suitable as it is, because graphs may be directed acyclic ones or cyclic ones. In this paper, we are addressing this problem. Our contributions are: (1) define the practical and formal security model of hashing schemes for graphs, (2) define the formal security model of perfectly secure hashing schemes, (3) describe constructions of hashing and perfectly secure hashing of graphs, and (4) performance results for the constructions. Our constructions use graph traversal techniques, and are highly efficient for hashing, redaction, and verification of hashes graphs. We have implemented the proposed schemes, and our performance analysis on both real and synthetic graph data sets support our claims

    Privacy, Access Control, and Integrity for Large Graph Databases

    Get PDF
    Graph data are extensively utilized in social networks, collaboration networks, geo-social networks, and communication networks. Their growing usage in cyberspaces poses daunting security and privacy challenges. Data publication requires privacy-protection mechanisms to guard against information breaches. In addition, access control mechanisms can be used to allow controlled sharing of data. Provision of privacy-protection, access control, and data integrity for graph data require a holistic approach for data management and secure query processing. This thesis presents such an approach. In particular, the thesis addresses two notable challenges for graph databases, which are: i) how to ensure users\u27 privacy in published graph data under an access control policy enforcement, and ii) how to verify the integrity and query results of graph datasets. To address the first challenge, a privacy-protection framework under role-based access control (RBAC) policy constraints is proposed. The design of such a framework poses a trade-off problem, which is proved to be NP-complete. Novel heuristic solutions are provided to solve the constraint problem. To the best of our knowledge, this is the first scheme that studies the trade-off between RBAC policy constraints and privacy-protection for graph data. To address the second challenge, a cryptographic security model based on Hash Message Authentic Codes (HMACs) is proposed. The model ensures integrity and completeness verification of data and query results under both two-party and third-party data distribution environments. Unique solutions based on HMACs for integrity verification of graph data are developed and detailed security analysis is provided for the proposed schemes. Extensive experimental evaluations are conducted to illustrate the performance of proposed algorithms

    Evaluating Machine Learning Techniques for Smart Home Device Classification

    Get PDF
    Smart devices in the Internet of Things (IoT) have transformed the management of personal and industrial spaces. Leveraging inexpensive computing, smart devices enable remote sensing and automated control over a diverse range of processes. Even as IoT devices provide numerous benefits, it is vital that their emerging security implications are studied. IoT device design typically focuses on cost efficiency and time to market, leading to limited built-in encryption, questionable supply chains, and poor data security. In a 2017 report, the United States Government Accountability Office recommended that the Department of Defense investigate the risks IoT devices pose to operations security, information leakage, and endangerment of senior leaders [1]. Recent research has shown that it is possible to model a subject’s pattern-of-life through data leakage from Bluetooth Low Energy (BLE) and Wi-Fi smart home devices [2]. A key step in establishing pattern-of-life is the identification of the device types within the smart home. Device type is defined as the functional purpose of the IoT device, e.g., camera, lock, and plug. This research hypothesizes that machine learning algorithms can be used to accurately perform classification of smart home devices. To test this hypothesis, a Smart Home Environment (SHE) is built using a variety of commercially-available BLE and Wi-Fi devices. SHE produces actual smart device traffic that is used to create a dataset for machine learning classification. Six device types are included in SHE: door sensors, locks, and temperature sensors using BLE, and smart bulbs, cameras, and smart plugs using Wi-Fi. In addition, a device classification pipeline (DCP) is designed to collect and preprocess the wireless traffic, extract features, and produce tuned models for testing. K-nearest neighbors (KNN), linear discriminant analysis (LDA), and random forests (RF) classifiers are built and tuned for experimental testing. During this experiment, the classifiers are tested on their ability to distinguish device types in a multiclass classification scheme. Classifier performance is evaluated using the Matthews correlation coefficient (MCC), mean recall, and mean precision metrics. Using all available features, the classifier with the best overall performance is the KNN classifier. The KNN classifier was able to identify BLE device types with an MCC of 0.55, a mean precision of 54%, and a mean recall of 64%, and Wi-Fi device types with an MCC of 0.71, a mean precision of 81%, and a mean recall of 81%. Experimental results provide support towards the hypothesis that machine learning can classify IoT device types to a high level of performance, but more work is necessary to build a more robust classifier

    Your WiFi Is Leaking: Inferring Private User Information Despite Encryption

    Get PDF
    This thesis describes how wireless networks can inadvertently leak and broadcast users' personal information despite the correct use of encryption. Users would likely assume that their activities (for example, the program or app they are using) and personal information (including age, religion, sexuality and gender) would remain confidential when using an encrypted network. However, we demonstrate how the analysis of encrypted traffic patterns can allow an observer to infer potentially sensitive data remotely, passively, undetectably, and without any network credentials. Without the ability to read encrypted WiFi traffic directly, the limited side-channel data available is processed. Following an investigation to determine what information is available and how it can be represented, it was determined that the comparison of various permutations of timing and frame size information is sufficient to distinguish specific user activities. The construction of classifiers via machine learning (Random Forests) utilising this side-channel information represented as histograms allows for the detection of user activity despite WiFi encryption. Studies showed that Skype voice traffic could be identified despite being interleaved with other activities. A subsequent study then demonstrated that mobile apps could be individually detected and, concerningly, used to infer potentially sensitive information about users due to their personalised nature. Furthermore, a full prototype system is developed and used to demonstrate that this analysis can be performed in real-time using low-cost commodity hardware in real-world scenarios. Avenues for improvement and the limitations of this approach are identified, and potential applications for this work are considered. Strategies to prevent these leaks are discussed and the effort required for an observer to present a practical privacy threat to the everyday WiFi user is examined

    Hardware Trojan Detection Using Machine Learning

    Get PDF
    The cyber-physical system’s security depends on the software and underlying hardware. In today’s times, securing hardware is difficult because of the globalization of the Integrated circuit’s manufacturing process. The main attack is to insert a “backdoor” that maliciously alters the original circuit’s behaviour. Such a malicious insertion is called a hardware trojan. In this thesis, the Random Forest Model has proposed for hardware trojan detection and this research focuses on improving the detection accuracy of the Random Forest model. The detection technique used the random forest machine learning model, which was trained by using the power traces of the circuit behaviour. The data required for training was obtained from an extensive database by simulating the circuit behaviours with various input vectors. The machine learning model was then compared with the state-of-art models in terms of accuracy in detecting malicious hardware. Our results show that the Random Forest classifier achieves an accuracy of 99.80 percent with a false positive rate (FPR)of 0.009 and a false negative rate (FNR) of 0.038 when the model is created to detect hardware trojans. Furthermore, our research shows that a trained model takes less training time and can be applied to large and complex datasets

    Secure Outsourced Computation on Encrypted Data

    Get PDF
    Homomorphic encryption (HE) is a promising cryptographic technique that supports computations on encrypted data without requiring decryption first. This ability allows sensitive data, such as genomic, financial, or location data, to be outsourced for evaluation in a resourceful third-party such as the cloud without compromising data privacy. Basic homomorphic primitives support addition and multiplication on ciphertexts. These primitives can be utilized to represent essential computations, such as logic gates, which subsequently can support more complex functions. We propose the construction of efficient cryptographic protocols as building blocks (e.g., equality, comparison, and counting) that are commonly used in data analytics and machine learning. We explore the use of these building blocks in two privacy-preserving applications. One application leverages our secure prefix matching algorithm, which builds on top of the equality operation, to process geospatial queries on encrypted locations. The other applies our secure comparison protocol to perform conditional branching in private evaluation of decision trees. There are many outsourced computations that require joint evaluation on private data owned by multiple parties. For example, Genome-Wide Association Study (GWAS) is becoming feasible because of the recent advances of genome sequencing technology. Due to the sensitivity of genomic data, this data is encrypted using different keys possessed by different data owners. Computing on ciphertexts encrypted with multiple keys is a non-trivial task. Current solutions often require a joint key setup before any computation such as in threshold HE or incur large ciphertext size (at best, grows linearly in the number of involved keys) such as in multi-key HE. We propose a hybrid approach that combines the advantages of threshold and multi-key HE to support computations on ciphertexts encrypted with different keys while vastly reducing ciphertext size. Moreover, we propose the SparkFHE framework to support large-scale secure data analytics in the Cloud. SparkFHE integrates Apache Spark with Fully HE to support secure distributed data analytics and machine learning and make two novel contributions: (1) enabling Spark to perform efficient computation on large datasets while preserving user privacy, and (2) accelerating intensive homomorphic computation through parallelization of tasks across clusters of computing nodes. To our best knowledge, SparkFHE is the first addressing these two needs simultaneously

    Deteção de propagação de ameaças e exfiltração de dados em redes empresariais

    Get PDF
    Modern corporations face nowadays multiple threats within their networks. In an era where companies are tightly dependent on information, these threats can seriously compromise the safety and integrity of sensitive data. Unauthorized access and illicit programs comprise a way of penetrating the corporate networks, able to traversing and propagating to other terminals across the private network, in search of confidential data and business secrets. The efficiency of traditional security defenses are being questioned with the number of data breaches occurred nowadays, being essential the development of new active monitoring systems with artificial intelligence capable to achieve almost perfect detection in very short time frames. However, network monitoring and storage of network activity records are restricted and limited by legal laws and privacy strategies, like encryption, aiming to protect the confidentiality of private parties. This dissertation proposes methodologies to infer behavior patterns and disclose anomalies from network traffic analysis, detecting slight variations compared with the normal profile. Bounded by network OSI layers 1 to 4, raw data are modeled in features, representing network observations, and posteriorly, processed by machine learning algorithms to classify network activity. Assuming the inevitability of a network terminal to be compromised, this work comprises two scenarios: a self-spreading force that propagates over internal network and a data exfiltration charge which dispatch confidential info to the public network. Although features and modeling processes have been tested for these two cases, it is a generic operation that can be used in more complex scenarios as well as in different domains. The last chapter describes the proof of concept scenario and how data was generated, along with some evaluation metrics to perceive the model’s performance. The tests manifested promising results, ranging from 96% to 99% for the propagation case and 86% to 97% regarding data exfiltration.Nos dias de hoje, várias organizações enfrentam múltiplas ameaças no interior da sua rede. Numa época onde as empresas dependem cada vez mais da informação, estas ameaças podem compremeter seriamente a segurança e a integridade de dados confidenciais. O acesso não autorizado e o uso de programas ilícitos constituem uma forma de penetrar e ultrapassar as barreiras organizacionais, sendo capazes de propagarem-se para outros terminais presentes no interior da rede privada com o intuito de atingir dados confidenciais e segredos comerciais. A eficiência da segurança oferecida pelos sistemas de defesa tradicionais está a ser posta em causa devido ao elevado número de ataques de divulgação de dados sofridos pelas empresas. Desta forma, o desenvolvimento de novos sistemas de monitorização ativos usando inteligência artificial é crucial na medida de atingir uma deteção mais precisa em curtos períodos de tempo. No entanto, a monitorização e o armazenamento dos registos da atividade da rede são restritos e limitados por questões legais e estratégias de privacidade, como a cifra dos dados, visando proteger a confidencialidade das entidades. Esta dissertação propõe metodologias para inferir padrões de comportamento e revelar anomalias através da análise de tráfego que passa na rede, detetando pequenas variações em comparação com o perfil normal de atividade. Delimitado pelas camadas de rede OSI 1 a 4, os dados em bruto são modelados em features, representando observações de rede e, posteriormente, processados por algoritmos de machine learning para classificar a atividade de rede. Assumindo a inevitabilidade de um terminal ser comprometido, este trabalho compreende dois cenários: um ataque que se auto-propaga sobre a rede interna e uma tentativa de exfiltração de dados que envia informações para a rede pública. Embora os processos de criação de features e de modelação tenham sido testados para estes dois casos, é uma operação genérica que pode ser utilizada em cenários mais complexos, bem como em domínios diferentes. O último capítulo inclui uma prova de conceito e descreve o método de criação dos dados, com a utilização de algumas métricas de avaliação de forma a espelhar a performance do modelo. Os testes mostraram resultados promissores, variando entre 96% e 99% para o caso da propagação e entre 86% e 97% relativamente ao roubo de dados.Mestrado em Engenharia de Computadores e Telemátic

    Advances in Information Security and Privacy

    Get PDF
    With the recent pandemic emergency, many people are spending their days in smart working and have increased their use of digital resources for both work and entertainment. The result is that the amount of digital information handled online is dramatically increased, and we can observe a significant increase in the number of attacks, breaches, and hacks. This Special Issue aims to establish the state of the art in protecting information by mitigating information risks. This objective is reached by presenting both surveys on specific topics and original approaches and solutions to specific problems. In total, 16 papers have been published in this Special Issue

    Cyber Security and Critical Infrastructures

    Get PDF
    This book contains the manuscripts that were accepted for publication in the MDPI Special Topic "Cyber Security and Critical Infrastructure" after a rigorous peer-review process. Authors from academia, government and industry contributed their innovative solutions, consistent with the interdisciplinary nature of cybersecurity. The book contains 16 articles: an editorial explaining current challenges, innovative solutions, real-world experiences including critical infrastructure, 15 original papers that present state-of-the-art innovative solutions to attacks on critical systems, and a review of cloud, edge computing, and fog's security and privacy issues
    corecore