179 research outputs found

    Anomaly Detection In Blockchain

    Anomaly detection has been a well-studied area for a long time. Its applications in the financial sector have aided in identifying suspicious activities of hackers. However, with the advancements in the financial domain such as blockchain and artificial intelligence, it is more challenging to deceive financial systems. Despite these technological advancements, many fraudulent cases have still emerged. Many artificial intelligence techniques have been proposed to deal with the anomaly detection problem; some results appear quite promising, but no solution is clearly superior. This thesis aims to bridge the gap between artificial intelligence and blockchain by pursuing various anomaly detection techniques on transactional network data of a public financial blockchain named 'Bitcoin'. This thesis also presents an overview of the blockchain technology and its application in the financial sector in light of anomaly detection. Furthermore, it extracts the transactional data of the Bitcoin blockchain and analyses it for malicious transactions using unsupervised machine learning techniques. A range of algorithms such as isolation forest, histogram-based outlier detection (HBOS), cluster-based local outlier factor (CBLOF), principal component analysis (PCA), K-means, deep autoencoder networks and an ensemble method are evaluated and compared.
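
    As a rough illustration of one of the unsupervised techniques named in this abstract, the sketch below computes a simplified histogram-based outlier score. It is not the thesis's implementation: the equal-width binning, the function name `hbos_scores`, and the two toy features (amount, number of inputs) are assumptions made for the example.

```python
import math
from collections import Counter


def hbos_scores(rows, n_bins=5):
    """Simplified histogram-based outlier score; higher means more anomalous.

    Assumes equal-width bins and treats features as independent.
    """
    scores = [0.0] * len(rows)
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        # Fall back to a width of 1.0 when the feature is constant.
        width = (hi - lo) / n_bins or 1.0
        # Equal-width binning; the maximum value is clamped into the last bin.
        bins = [min(int((v - lo) / width), n_bins - 1) for v in column]
        counts = Counter(bins)
        for i, b in enumerate(bins):
            density = counts[b] / len(column)
            # Sparse bins contribute a large log(1 / density) term.
            scores[i] += math.log(1.0 / density)
    return scores


# Hypothetical toy data: (amount, number of inputs) per transaction.
transactions = [(1.0, 2), (1.2, 2), (0.9, 1), (1.1, 2), (1.0, 1), (50.0, 40)]
scores = hbos_scores(transactions)
most_anomalous = scores.index(max(scores))
print(most_anomalous, [round(s, 3) for s in scores])
```

    In this toy data the last transaction falls into a sparsely populated bin on both features, so it receives the highest score.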

    Graph Mining for Cybersecurity: A Survey

    The explosive growth of cyber attacks, such as malware, spam, and intrusions, has had severe consequences for society. Securing cyberspace has become an utmost concern for organizations and governments. Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities. In recent years, with the proliferation of graph mining techniques, many researchers have investigated these techniques for capturing correlations between cyber entities and achieving high performance. It is imperative to summarize existing graph-based cybersecurity solutions to provide a guide for future studies. Therefore, as a key contribution of this paper, we provide a comprehensive review of graph mining for cybersecurity, including an overview of cybersecurity tasks, the typical graph mining techniques, and the general process of applying them to cybersecurity, as well as various solutions for different cybersecurity tasks. For each task, we probe into relevant methods and highlight the graph types, graph approaches, and task levels in their modeling. Furthermore, we collect open datasets and toolkits for graph-based cybersecurity. Finally, we outline potential directions for future research in this field.

    Behavioral analysis in cybersecurity using machine learning: a study based on graph representation, class imbalance and temporal dissection

    The main goal of this thesis is to improve behavioral cybersecurity analysis using machine learning, exploiting graph structures, temporal dissection, and addressing imbalance problems. This main objective is divided into four specific goals: OBJ1: To study the influence of the temporal resolution on highlighting micro-dynamics in the entity behavior classification problem. In real use cases, time-series information may not be enough to describe entity behavior. For this reason, we plan to exploit graph structures for integrating both structured and unstructured data in a representation of entities and their relationships. In this way, it will be possible to appreciate not only the single temporal communication but the whole behavior of these entities. Nevertheless, entity behaviors evolve over time and therefore a static graph may not be enough to describe all these changes. For this reason, we propose to use a temporal dissection for creating temporal subgraphs and thereby analyze the influence of the temporal resolution on the graph creation and the entity behaviors within. Furthermore, we propose to study how the temporal granularity should be used for highlighting network micro-dynamics and short-term behavioral changes which can be a hint of suspicious activities. OBJ2: To develop novel sampling methods that work with disconnected graphs for addressing imbalanced problems avoiding component topology changes. Graph imbalance is a very common and challenging problem, and traditional graph sampling techniques that work directly on these structures cannot be used without modifying the graph’s intrinsic information or introducing bias. Furthermore, existing techniques have been shown to be limited when disconnected graphs are used. For this reason, novel resampling methods for balancing the number of nodes that can be directly applied over disconnected graphs, without altering component topologies, need to be introduced. 
In particular, we propose to take advantage of the existence of disconnected graphs to detect and replicate the most relevant graph components without changing their topology, while considering traditional data-level strategies for handling the entity behaviors within. OBJ3: To study the usefulness of the generative adversarial networks for addressing the class imbalance problem in cybersecurity applications. Although traditional data-level pre-processing techniques have been shown to be effective for addressing class imbalance problems, they have also shown adverse effects when highly variable datasets are used, as happens in cybersecurity. For this reason, new techniques that can exploit the overall data distribution for learning highly variable behaviors should be investigated. In this sense, GANs have shown promising results in the image and video domain; however, their extension to tabular data is not trivial. For this reason, we propose to adapt GANs for working with cybersecurity data and exploit their ability to learn and reproduce the input distribution for addressing the class imbalance problem (as an oversampling technique). Furthermore, since it is not possible to find a unique GAN solution that works for every scenario, we propose to study several GAN architectures with several training configurations to detect which is the best option for a cybersecurity application. OBJ4: To analyze temporal data trends and performance drift for enhancing cyber threat analysis. Temporal dynamics and incoming new data can affect the quality of the predictions, compromising the model reliability. This phenomenon causes models to become outdated without anyone noticing. In this sense, it is very important to be able to extract more insightful information from the application domain by analyzing data trends, learning processes, and performance drifts over time. 
For this reason, we propose to develop a systematic approach for analyzing how data quality and quantity affect the learning process. Moreover, in the context of CTI, we propose to study the relations between temporal performance drifts and the input data distribution for detecting possible model limitations, enhancing cyber threat analysis. Programa de Doctorado en Ciencias y Tecnologías Industriales (RD 99/2011) Industria Zientzietako eta Teknologietako Doktoretza Programa (ED 99/2011)
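
    The temporal dissection described under OBJ1 can be pictured with a minimal sketch that splits timestamped communications into fixed-width windows, one subgraph per window. This is only an illustration under assumed conventions, not the thesis's method: the `(source, target, timestamp)` edge format, the function name `temporal_subgraphs`, and the sample edges are all hypothetical.

```python
from collections import defaultdict


def temporal_subgraphs(edges, window):
    """Group (source, target, timestamp) edges into fixed-width time windows.

    Returns {window_index: {node: sorted list of neighbours}}.
    """
    snapshots = defaultdict(lambda: defaultdict(set))
    for source, target, timestamp in edges:
        index = int(timestamp // window)
        snapshots[index][source].add(target)
    # Convert nested defaultdicts into plain, deterministically ordered dicts.
    return {
        index: {node: sorted(neighbours) for node, neighbours in adjacency.items()}
        for index, adjacency in sorted(snapshots.items())
    }


# Hypothetical communications between three entities.
edges = [("a", "b", 1), ("a", "c", 4), ("b", "c", 12), ("c", "a", 14), ("a", "b", 27)]

fine = temporal_subgraphs(edges, window=10)    # three short-term snapshots
coarse = temporal_subgraphs(edges, window=30)  # one aggregated snapshot
print(fine)
print(coarse)
```

    With a window of 10 the burst of activity between `b` and `c` appears as its own snapshot, while a window of 30 merges everything into a single graph, which is the kind of resolution effect the objective sets out to study.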

    Detecting Selfish Mining Attacks Against a Blockchain Using Machine Learning

    Selfish mining is an attack against a blockchain where miners hide newly discovered blocks instead of publishing them to the rest of the network. Selfish mining has been a potential issue for blockchains since it was first discovered by Eyal and Sirer. It can be used by malicious miners to earn a disproportionate share of the mining rewards or in conjunction with other attacks to steal money from network users. Several of these attacks were launched in 2018, 2019, and 2020, with the attackers stealing as much as $18 million. Developers made several different attempts to fix this issue, but the effectiveness of the fixes is currently unknown. Despite the known vulnerability, there is little research into detecting these attacks either historically or in real time. In this research, we build a program to gather data from known selfish mining attacks against the Ethereum Classic blockchain. We then use this data to train a machine-learning algorithm to discover the important features for detecting selfish mining.

    Novel Machine Learning Models Based Uncertainty Estimation and Sequential Predictions for Blockchain Networks

    Cryptocurrency-based blockchains (decentralised banks of public ledgers of transactions and pseudonymous identities) lure criminals to go incognito behind an alphanumeric address to conduct illicit activities over the network. Consequently, this double-edged technology makes it necessary to analyse blockchain data to detect illicit activities. In the existing literature, visual analytics have been widely used to gain useful insights from large-scale blockchain data via graph network analysis and visual analytics tools. However, a straightforward visualisation is not very effective with the increasing complexity of the blockchain network. On the other hand, a machine learning approach is capable of dealing with the massive amount of data generated by the public blockchain. Machine learning techniques have provided promising results in many big data applications, e.g. social media, IoT, and recently blockchain. Nevertheless, the public blockchain is subject to unseen events that the machine learning model might not be aware of. In this work, machine learning models are developed in a novel way and uncertainty quantification methods are exploited to detect illicit activities in the public blockchain. Various supervised learning models are thoroughly explored using two datasets derived from Bitcoin and Ethereum blockchains. The unexpected events of blockchain are addressed by computing uncertainty estimates besides the machine learning model’s predictions. Consequently, a novel uncertainty estimation method is proposed that reveals an effective performance in comparison to existing methods. In this context, uncertainty estimation reflects the model’s uncertain predictions about a given input. Subsequently, two distinct frameworks are presented that serve different purposes using blockchain data. 
The first framework is viewed as an end-to-end prototype of the temporal-GNN (temporal graph neural network) classification model based on an active learning process to reduce the laborious labelling process of blockchain data. In particular, the active learning approach utilises the predicted uncertainties to query the most informative data points, and the active learning framework is performed and evaluated using a variety of acquisition functions. The other framework presents a novel model based on sequential predictions. This model, RecGNN, is a recurrent graph convolutional network that requires the predictions of the antecedent nodes in the Bitcoin transaction graph as input features. As a result, this project shows the effectiveness of using tree-based classifiers in classifying data derived from the public blockchain. This allows the detection of illicit activities (e.g. illicit transactions, users, accounts) that operate over the blockchain. It also highlights the competence of models based on graph learning algorithms throughout this project. The active learning frameworks using the temporal-GNN model have revealed promising results. Moreover, the highest performance, provided by the RecGNN model, is discussed: it classifies illicit Bitcoin transactions with an accuracy of 98.09% and an F1-score of 91.75%. The experimental results are evaluated using classification metrics and other statistical measurements. The limitations, challenges and possible future directions are also demonstrated. Keywords: Machine Learning, Supervised Learning, Graph Neural Network, Temporal Graph Neural Network, Uncertainty Estimation, Bayesian Methods, Active Learning, Sequential Prediction, Blockchain
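
    The idea of querying the most informative points from predicted uncertainties can be sketched with predictive entropy, one commonly used acquisition function. The abstract does not say which acquisition functions the thesis evaluates, so the choice of entropy, the function names, and the class probabilities below are assumptions for illustration only.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy of a class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_queries(pool_probs, k):
    """Return indices of the k unlabelled points with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical [P(licit), P(illicit)] predictions for four unlabelled transactions.
pool = [[0.99, 0.01], [0.55, 0.45], [0.80, 0.20], [0.50, 0.50]]
queries = select_queries(pool, k=2)
print(queries)  # [3, 1]
```

    The two transactions whose predictions sit closest to an even split are selected for labelling, while the confidently classified ones are left in the pool.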

    Cyber Security

    This open access book constitutes the refereed proceedings of the 18th China Annual Conference on Cyber Security, CNCERT 2022, held in Beijing, China, in August 2022. The 17 papers presented were carefully reviewed and selected from 64 submissions. The papers are organized according to the following topical sections: data security; anomaly detection; cryptocurrency; information security; vulnerabilities; mobile internet; threat intelligence; text recognition.