
    Topological Anomaly Detection in Dynamic Multilayer Blockchain Networks

    Motivated by the recent surge of criminal activity involving cross-cryptocurrency trades, we introduce a new topological perspective on structural anomaly detection in dynamic multilayer networks. We postulate that anomalies in the underlying multilayer blockchain transaction graph are likely to also manifest as anomalous patterns in the network's shape properties. As such, we invoke the machinery of clique persistent homology on graphs to systematically and efficiently track the evolution of the network shape and, as a result, to detect changes in the underlying network topology and geometry. We develop a new persistence summary for multilayer networks, called the stacked persistence diagram, and prove its stability under input data perturbations. We validate our new topological anomaly detection framework on dynamic multilayer networks from the Ethereum Blockchain and the Ripple Credit Network, and demonstrate that our stacked PD approach substantially outperforms state-of-the-art techniques.
    Comment: 26 pages, 6 figures, 7 tables
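
    A minimal sketch of the general idea (not the paper's exact construction): compute a persistence diagram for each layer of a multilayer graph from a flag (clique) filtration and concatenate the layer-tagged diagrams into one stacked summary. The shortest-path filtration, the ripser-based computation, and the stacking format are illustrative assumptions.

        # Per-layer persistence via a flag (clique) filtration, stacked across
        # layers. Uses shortest-path distances as an illustrative filtration;
        # the paper's exact stacked-PD construction may differ.
        import numpy as np
        import networkx as nx
        from ripser import ripser  # computes Vietoris-Rips (flag) persistence

        def layer_persistence(G, maxdim=1):
            """Persistence diagrams of one weighted network layer."""
            nodes = list(G.nodes)
            n = len(nodes)
            D = np.full((n, n), np.inf)
            lengths = dict(nx.all_pairs_dijkstra_path_length(G, weight="weight"))
            for i, u in enumerate(nodes):
                D[i, i] = 0.0
                for j, v in enumerate(nodes):
                    if v in lengths.get(u, {}):
                        D[i, j] = lengths[u][v]
            D[~np.isfinite(D)] = D[np.isfinite(D)].max() * 2  # cap unreachable pairs
            return ripser(D, distance_matrix=True, maxdim=maxdim)["dgms"]

        def stacked_diagram(layers, maxdim=1):
            """Tag each layer's diagram points with (layer, dim) and stack them."""
            points = []
            for layer_idx, G in enumerate(layers):
                for dim, dgm in enumerate(layer_persistence(G, maxdim)):
                    for birth, death in dgm:
                        points.append((layer_idx, dim, birth, death))
            return np.array(points)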

    Pump and Dumps in the Bitcoin Era: Real Time Detection of Cryptocurrency Market Manipulations

    In recent years, cryptocurrencies have become increasingly popular. Even people who are not experts have started to invest in these securities, and cryptocurrency exchanges now process transactions worth over 100 billion US dollars per month. However, many cryptocurrencies have low liquidity and are therefore highly prone to market manipulation schemes. In this paper, we perform an in-depth analysis of pump and dump schemes organized by communities over the Internet. We observe how these communities are organized and how they carry out the fraud. Then, we report on two case studies related to pump and dump groups. Lastly, we introduce an approach to detect the fraud in real time that outperforms the current state of the art, so as to help investors stay out of the market when a pump and dump scheme is in action.
    Comment: Accepted for publication at The 29th International Conference on Computer Communications and Networks (ICCCN 2020)
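
    As a rough illustration of real-time detection on a price/volume stream (not the paper's trained detector), the sketch below flags a tick when price and volume jointly spike relative to a rolling baseline; the window length and z-score threshold are assumptions.

        # Streaming detector: flag a tick when price and volume jointly spike
        # relative to a rolling baseline. Window and threshold are assumptions.
        from collections import deque
        import math

        class RollingZScoreDetector:
            def __init__(self, window=60, z_thresh=4.0):
                self.prices = deque(maxlen=window)
                self.volumes = deque(maxlen=window)
                self.z_thresh = z_thresh

            @staticmethod
            def _zscore(history, value):
                mean = sum(history) / len(history)
                var = sum((x - mean) ** 2 for x in history) / len(history)
                return (value - mean) / math.sqrt(var) if var > 0 else 0.0

            def update(self, price, volume):
                """Return True if the new tick looks like the start of a pump."""
                alert = False
                if len(self.prices) == self.prices.maxlen:  # warm-up complete
                    alert = (self._zscore(self.prices, price) > self.z_thresh
                             and self._zscore(self.volumes, volume) > self.z_thresh)
                self.prices.append(price)
                self.volumes.append(volume)
                return alert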

    To the moon: defining and detecting cryptocurrency pump-and-dumps

    Pump-and-dump schemes are fraudulent price manipulations through the spread of misinformation and have existed in economic settings since at least the 1700s. With new technologies around cryptocurrency trading, the problem has intensified to a shorter time scale and a broader scope. The scientific literature on cryptocurrency pump-and-dump schemes is scarce, and government regulation has not yet caught up, leaving cryptocurrencies particularly vulnerable to this type of market manipulation. This paper examines existing information on pump-and-dump schemes from the classical economic literature, synthesises it with the specifics of cryptocurrency markets, and proposes criteria that can be used to define a cryptocurrency pump-and-dump. These pump-and-dump patterns exhibit anomalous behaviour; thus, techniques from anomaly detection research are utilised to locate points of anomalous trading activity and flag potential pump-and-dump activity. The findings suggest that there are signals in the trading data that might help detect pump-and-dump schemes, and we demonstrate these in our detection system by examining several real-world cases. Moreover, we found that fraudulent activity clusters on specific cryptocurrency exchanges and coins. The approach, data, and findings of this paper might form a basis for further research into this emerging fraud problem and could ultimately inform crime prevention.
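
    A hedged sketch of criteria-style flagging on candle data, assuming OHLCV columns named high and volume; the rolling window and spike factors are illustrative placeholders, not the paper's calibrated thresholds.

        # Batch filter over candle data: flag candles whose high and volume
        # exceed a rolling baseline by fixed factors (all values illustrative).
        import pandas as pd

        def flag_pump_candidates(ohlcv: pd.DataFrame, window: int = 12,
                                 price_factor: float = 1.05,
                                 volume_factor: float = 5.0) -> pd.DataFrame:
            """ohlcv: DataFrame with 'high' and 'volume' columns, one row per candle."""
            base_price = ohlcv["high"].rolling(window).mean().shift(1)
            base_volume = ohlcv["volume"].rolling(window).mean().shift(1)
            spike = ((ohlcv["high"] > price_factor * base_price)
                     & (ohlcv["volume"] > volume_factor * base_volume))
            return ohlcv[spike]  # candidate pump-and-dump candles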

    Tools and algorithms to advance interactive intrusion analysis via Machine Learning and Information Retrieval

    We consider typical tasks that arise in the intrusion analysis of log data from the perspectives of Machine Learning and Information Retrieval, and we study a number of data organization and interactive learning techniques to improve the analyst's efficiency. In doing so, we attempt to translate intrusion analysis problems into the language of the abovementioned disciplines and to offer metrics to evaluate the effect of the proposed techniques. The Kerf toolkit contains prototype implementations of these techniques, as well as data transformation tools that help bridge the gap between real-world log data formats and the ML and IR data models. We also describe the log representation approach that the Kerf prototype tools are based on. In particular, we describe the connection between decision trees, automatic classification algorithms, and the log analysis techniques implemented in Kerf.
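
    To illustrate the decision-tree/log-analysis connection mentioned above, the following sketch vectorizes raw log lines and fits a small tree with scikit-learn; the toy log lines and labels are invented for illustration and are not Kerf's actual representation or features.

        # Toy example: token-count features + a shallow decision tree that
        # separates "suspicious" from "benign" log lines (labels invented).
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.tree import DecisionTreeClassifier

        log_lines = [
            "sshd: Accepted password for alice from 10.0.0.5",
            "sshd: Failed password for root from 203.0.113.9",
            "sshd: Failed password for admin from 203.0.113.9",
            "cron: job finished ok",
        ]
        labels = [0, 1, 1, 0]  # 1 = suspicious (toy labels)

        vectorizer = CountVectorizer(token_pattern=r"[A-Za-z0-9.]+")
        X = vectorizer.fit_transform(log_lines)
        tree = DecisionTreeClassifier(max_depth=3).fit(X, labels)

        new_line = ["sshd: Failed password for root from 198.51.100.7"]
        print(tree.predict(vectorizer.transform(new_line)))  # expected: [1]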

    Fraud Pattern Detection for NFT Markets

    Non-Fungible Tokens (NFTs) enable ownership and transfer of digital assets using blockchain technology. As a relatively new financial asset class, NFTs lack robust oversight and regulation. These conditions create an environment that is susceptible to fraudulent activity and market manipulation schemes. This study examines buyer-seller network transactional data from some of the most popular NFT marketplaces (e.g., AtomicHub, OpenSea) to identify and predict fraudulent activity. To accomplish this goal, multiple features such as price, volume, and network metrics were extracted from the NFT transactional data and fed into a Multiple-Scale Convolutional Neural Network that predicts suspected fraudulent activity based on pattern recognition. This approach provides a more generic form of time series classification at different frequencies and timescales to recognize fraudulent NFT patterns. Results showed that the model identified over 80% of confirmed fraudulent cases (recall), and that half of the cases it flagged as fraud were correct (precision of 50%). Investors, regulators, and other entities can use these techniques to reduce their risk exposure to fraudulent NFT activity.
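
    A sketch of what a multiple-scale 1D CNN of this kind could look like in PyTorch: parallel convolutional branches with different kernel sizes read the same feature series at different timescales. The kernel widths, channel counts, and three input features (price, volume, a network metric) are assumptions, not the study's architecture.

        # Parallel 1D convolutions with different kernel sizes capture patterns
        # at different timescales; sizes and channel counts are assumptions.
        import torch
        import torch.nn as nn

        class MultiScaleCNN(nn.Module):
            def __init__(self, in_features=3, scales=(3, 7, 15), channels=16):
                super().__init__()
                self.branches = nn.ModuleList(
                    nn.Sequential(
                        nn.Conv1d(in_features, channels, kernel_size=k, padding=k // 2),
                        nn.ReLU(),
                        nn.AdaptiveMaxPool1d(1),
                    )
                    for k in scales
                )
                self.classifier = nn.Linear(channels * len(scales), 2)  # fraud / benign

            def forward(self, x):  # x: (batch, features, time)
                pooled = [branch(x).squeeze(-1) for branch in self.branches]
                return self.classifier(torch.cat(pooled, dim=1))

        model = MultiScaleCNN()
        logits = model(torch.randn(8, 3, 96))  # 8 windows, 3 features, 96 steps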

    Typicality distribution function:a new density-based data analytics tool

    In this paper, a new density-based, non-frequentist data analytics tool, called the typicality distribution function (TDF), is proposed. It is a further development of the recently introduced typicality- and eccentricity-based data analytics (TEDA) framework. The newly introduced TDF and its standardized form offer an effective alternative to the widely used probability distribution function (pdf) while remaining free from the restrictive assumptions made and required by the latter. In particular, it offers an exact solution for any number of non-coinciding data samples (except a single point). By comparison, the well-developed and widely used traditional probability theory and related statistical learning approaches require (theoretically) an infinitely large number of data samples/observations, although in practice this requirement is often ignored. Furthermore, TDF does not require the user to pre-select or assume a particular distribution (e.g. Gaussian or other), a mixture of such distributions, or the number of distributions in a mixture. In addition, it does not require the individual data items to be independent. At the same time, the link with traditional statistical approaches such as the well-known “nσ” analysis, the Chebyshev inequality, etc. leads to the interesting conclusion that the same type of analysis can be made using TDF automatically, without the restrictive prior assumptions listed above to which these traditional approaches are tied. TDF can provide valuable information for the analysis of extreme processes and for fault detection and identification, where the number of observations of extreme events or faults is usually disproportionately small. The newly proposed TDF offers a non-parametric, closed-form analytical (quadratic) description extracted exactly from the real data realizations, in contrast to the usual practice of pre-assuming or approximating such distributions. For example, so-called particle filters are also a non-parametric approximation of traditional statistics; however, they suffer from computational complexity and introduce a large number of dummy data points. In addition, for several types of proximity/similarity measures (such as Euclidean, Mahalanobis, and cosine) TDF can be calculated recursively and thus very efficiently, making it suitable for real-time and online algorithms. Moreover, a very simple example illustrates that, while traditional probability theory and related statistical approaches can in some cases lead to paradoxically incorrect results and/or require hard prior assumptions, the newly proposed TDF offers a logically meaningful result and an intuitive interpretation automatically and exactly, without any prior assumptions. Finally, a few simple univariate examples are provided, the process of inference is discussed, and future steps in the development of TDF and TEDA are outlined. Since this is a new fundamental theoretical innovation, the areas of application of TDF and TEDA span anomaly detection, clustering, classification, prediction, control, regression, and (Kalman-like) filtering. Practical applications can be even wider and are therefore difficult to list exhaustively.
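
    A sketch of the recursive, Euclidean-distance case mentioned above: running estimates of the mean and scatter give an O(1)-per-sample eccentricity and typicality update. The formulas follow the TEDA literature as commonly stated; treat the exact normalization as an assumption rather than the authors' reference implementation.

        # Recursive mean/scatter updates give O(1)-per-sample eccentricity;
        # the normalization follows the TEDA papers as commonly stated.
        import numpy as np

        class RecursiveTypicality:
            def __init__(self, dim):
                self.k = 0
                self.mean = np.zeros(dim)
                self.sq_norm_mean = 0.0  # running mean of ||x||^2

            def update(self, x):
                """Ingest sample x; return (eccentricity, typicality) of x."""
                x = np.asarray(x, dtype=float)
                self.k += 1
                self.mean += (x - self.mean) / self.k
                self.sq_norm_mean += (x @ x - self.sq_norm_mean) / self.k
                var = self.sq_norm_mean - self.mean @ self.mean  # total scatter
                if self.k < 2 or var <= 0:
                    return 1.0, 0.0  # undefined before two distinct samples
                d2 = (x - self.mean) @ (x - self.mean)
                ecc = 1.0 / self.k + d2 / (self.k * var)
                return ecc, 1.0 - ecc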