
    SourceP: Smart Ponzi Schemes Detection on Ethereum Using Pre-training Model with Data Flow

    As blockchain technology grows in popularity, a typical financial scam, the Ponzi scheme, has also emerged on the Ethereum blockchain platform. Ponzi schemes deployed through smart contracts, also known as smart Ponzi schemes, have caused substantial economic losses and other negative impacts. Existing methods for detecting smart Ponzi schemes on Ethereum mainly rely on bytecode features, opcode features, account features, and transaction behavior features of smart contracts, and such methods lack interpretability and sustainability. In this paper, we propose SourceP, a method to detect smart Ponzi schemes on the Ethereum platform using pre-training models and data flow. It requires only the source code of smart contracts as features, exploring the possibility of detecting smart Ponzi schemes from another direction. SourceP reduces the difficulty of data acquisition and feature extraction of existing detection methods while increasing the interpretability of the model. Specifically, we first convert the source code of a smart contract into a data flow graph and then introduce a pre-training model based on learning code representations to build a classification model to identify Ponzi schemes in smart contracts. The experimental results show that SourceP achieves 87.2% recall and 90.7% F-score for detecting smart Ponzi schemes within Ethereum's smart contract dataset, outperforming state-of-the-art methods in terms of performance and sustainability. We also demonstrate through additional experiments that pre-training models and data flow contribute substantially to SourceP, and that SourceP generalizes well.
    Comment: 12 pages
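The first step named in the abstract, turning source code into a data flow graph, can be illustrated with a toy sketch. This is not SourceP's parser: the regex-based extraction, the function name `data_flow_edges`, and the sample statements are all assumptions made for illustration, and the sketch handles only plain `target = expr` assignments.

```python
import re

# Matches identifiers; numeric literals are skipped because they start with a digit.
IDENT = re.compile(r"[A-Za-z_]\w*")

def data_flow_edges(statements):
    """Return (source_var, target_var) pairs for simple `target = expr` statements.

    Hypothetical helper: an edge (a, b) means the value of b is computed from a.
    """
    edges = []
    for stmt in statements:
        target, sep, expr = stmt.partition("=")
        if not sep:
            continue  # not an assignment, so it contributes no data flow here
        target = target.strip()
        for name in IDENT.findall(expr):
            edges.append((name, target))
    return edges

# Invented statements in the style of a payout routine (not taken from any real contract).
toy_statements = [
    "fee = amount / 10",
    "payout = amount - fee",
    "balance = balance + payout",
]
edges = data_flow_edges(toy_statements)
```

A real pipeline would derive these edges from a proper syntax tree of the Solidity source and pass the resulting graph, together with the code tokens, to the pre-trained model.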

    An Organized Repository of Ethereum Smart Contracts’ Source Codes and Metrics

    Many empirical software engineering studies show that there is a need for repositories where source code is acquired, filtered and classified. During the last few years, Ethereum block explorer services have emerged as a popular way to explore and search Ethereum blockchain data such as transactions, addresses, tokens, smart contracts’ source code, prices and other activities taking place on the Ethereum blockchain. Despite the availability of this kind of service, retrieving specific information useful to empirical software engineering studies, such as the study of smart contracts’ software metrics, might require many subtasks, such as searching for specific transactions in a block, parsing files in HTML format, and filtering the smart contracts to remove duplicated code or unused smart contracts. In this paper, we address this problem by creating Smart Corpus, a corpus of smart contracts in an organized, reasoned and up-to-date repository where Solidity source code and other metadata about Ethereum smart contracts can easily and systematically be retrieved. We present Smart Corpus’s design and its initial implementation, and we show how the data set of smart contracts’ source code in a variety of programming languages can be queried and processed to get useful information on smart contracts and their software metrics. Smart Corpus aims to create a smart-contract repository where smart-contract data (source code, application binary interface (ABI) and byte code) are freely and immediately available and are classified based on the main software metrics identified in the scientific literature. Smart contracts’ source code has been validated by EtherScan, and each contract comes with its own associated software metrics as computed by the freely available software PASO. Moreover, Smart Corpus can be easily extended as the number of new smart contracts increases day by day.

    Smart Contract Upgradeability on the Ethereum Blockchain Platform: An Exploratory Study

    Context: Smart contracts are computerized self-executing contracts that contain clauses, which are enforced once certain conditions are met. Smart contracts are immutable by design and cannot be modified once deployed, which ensures trustlessness. Despite the benefits of immutability, upgrading contract code is still necessary for bug fixes and potential feature improvements. In the past few years, the smart contract community introduced several practices for upgrading smart contracts. Upgradeable contracts are smart contracts that exhibit these practices and are designed with upgradeability in mind. During the upgrade process, a new smart contract version is deployed with the desired modification, and subsequent user requests are forwarded to the latest version (the upgraded contract). Nevertheless, little is known about the characteristics of the upgrading practices, how developers apply them, and how upgrading impacts contract usage. Objectives: This paper aims to characterize smart contract upgrading patterns and analyze their prevalence based on the deployed contracts that exhibit these patterns. Furthermore, we intend to investigate the reasons why developers upgrade contracts (e.g., to introduce features or fix vulnerabilities) and how upgrades affect the adoption and life span of a contract in practice. Method: We collect the metadata and source code of deployed smart contracts to identify contracts that exhibit certain upgrade patterns (upgradeable contracts) based on a set of policies. Then we trace smart contract versions for each upgradeable contract and identify the changes between contract versions using similarity and vulnerability detection tools. Finally, we plan to analyze the impact of upgrading on contract usage based on the number of transactions received and the lifetime of the contract version.
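The version-comparison step in the Method can be sketched as follows. The abstract does not name its similarity tool, so this example substitutes Python's standard `difflib.SequenceMatcher` as a stand-in; the function name `version_similarity` and both contract snippets are hypothetical.

```python
from difflib import SequenceMatcher

def version_similarity(old_source, new_source):
    """Line-level similarity in [0, 1] between two versions of a contract's source."""
    old_lines = old_source.splitlines()
    new_lines = new_source.splitlines()
    # ratio() is 2 * (matching lines) / (total lines in both versions).
    return SequenceMatcher(None, old_lines, new_lines).ratio()

# Two invented versions of the same contract; only the third line differs.
v1 = (
    "contract Vault {\n"
    "    uint256 total;\n"
    "    function add(uint256 x) public { total += x; }\n"
    "}"
)
v2 = (
    "contract Vault {\n"
    "    uint256 total;\n"
    "    function add(uint256 x) public { require(x > 0); total += x; }\n"
    "}"
)
score = version_similarity(v1, v2)
```

A score near 1 suggests a small patch such as a bug fix, while a low score suggests a larger rewrite. Any threshold for separating the two would have to be calibrated on real upgrade histories.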

    A Blockchain-based Model for Securing Data Pipeline in a Heterogeneous Information System

    In our digital world, access to personal and public data has become an item of concern, with challenging security and privacy aspects. Modern information systems are heterogeneous in nature and have an inherent security vulnerability: unsecured communication data pipelines between connected endpoints leave them susceptible to data interception and data modification. This research article presents a blockchain-based model for securing data pipelines in a heterogeneous information system using an integrated multi-hazard early warning system (MHEWS) as a case study. The proposed model utilizes the inherent security features of blockchain technology to address the security and privacy concerns that arise in data pipelines. The model is designed to ensure data integrity, confidentiality, and authenticity in a decentralized manner. The model is evaluated in a hybrid environment using a prototype implementation and simulation experiments with outcomes that demonstrate advantages over traditional approaches for a tamper-proof and immutable data pipeline for data authenticity and integrity using a confidential ledger.
    Comment: 13 pages

    Graph representation learning for security analytics in decentralized software systems and social networks

    With the rapid advancement in digital transformation, various daily interactions, transactions, and operations typically depend on extensive network-structured systems. The inherent complexity of these platforms has become a critical challenge in ensuring their security and robustness, with impacts spanning individual users to large-scale organizations. Graph representation learning has emerged as a potential methodology to address various security analytics within these complex systems, especially in software code and social network analysis, and its applications in criminology. For software code, graph representations can capture the information of control-flow graphs and call graphs, which can be leveraged to detect vulnerabilities and improve software reliability. In the case of social network analysis in criminal investigation, graph representations can capture the social connections and interactions between individuals, which can be used to identify key players, detect illegal activities, and predict new/unobserved criminal cases. In this thesis, we focus on two critical security topics using graph learning-based approaches: (1) addressing criminal investigation issues and (2) detecting vulnerabilities of Ethereum blockchain smart contracts. First, we propose the SoChainDB database, which facilitates obtaining data from blockchain-based social networks and conducting extensive analyses to understand Hive blockchain social data. Moreover, to apply social network analysis in criminal investigation, two graph-based machine learning frameworks are presented to address investigation issues in a burglary use case, one being transductive link prediction and the other being inductive link prediction. Then, we propose MANDO, an approach that utilizes a new heterogeneous graph representation of control-flow graphs and call graphs to learn the structures of heterogeneous contract graphs. Building upon MANDO, two deep graph learning-based frameworks, MANDO-GURU and MANDO-HGT, are proposed for accurate vulnerability detection at both the coarse-grained contract and fine-grained line levels. Empirical results show that MANDO frameworks significantly improve the detection accuracy of other state-of-the-art techniques for various vulnerability types in either source code or bytecode.
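The idea of one heterogeneous graph that combines control-flow graphs with a call graph can be illustrated with a small sketch. This is not MANDO's representation: the edge-type labels, the `ENTRY` node convention, the function name `merge_graphs`, and the sample functions are assumptions chosen only to show typed nodes and typed edges in a single structure.

```python
from collections import defaultdict

def merge_graphs(cfg_edges, call_edges):
    """Combine per-function control-flow edges and call-graph edges into one typed graph.

    Nodes are (function_name, block_label) pairs; each outgoing edge is stored
    as (edge_type, target_node). All labels here are hypothetical.
    """
    graph = defaultdict(list)
    # Control-flow edges stay inside a single function.
    for func, edges in cfg_edges.items():
        for src, dst in edges:
            graph[(func, src)].append(("control_flow", (func, dst)))
    # Call edges link one function's entry node to another's.
    for caller, callee in call_edges:
        graph[(caller, "ENTRY")].append(("calls", (callee, "ENTRY")))
    return dict(graph)

# Invented example: `withdraw` calls `send`.
cfg_edges = {
    "withdraw": [("ENTRY", "check"), ("check", "send")],
    "send": [("ENTRY", "transfer")],
}
call_edges = [("withdraw", "send")]
hetero = merge_graphs(cfg_edges, call_edges)
```

Keeping the edge type on every edge is what makes the graph heterogeneous: a downstream model can learn separate behavior for control-flow edges and call edges instead of treating all connections alike.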