Smart Learning to Find Dumb Contracts

Abstract

We introduce Deep Learning Vulnerability Analyzer (DLVA), a vulnerability detection tool for Ethereum smart contracts based on powerful deep learning techniques for sequential data adapted for bytecode. We train DLVA to judge bytecode even though the supervising oracle, Slither, can only judge source code. DLVA's training algorithm is general: we "extend" a source code analysis to bytecode without any manual feature engineering, predefined patterns, or expert rules. DLVA's training algorithm is also robust: it overcame a 1.25% error rate mislabeled contracts, and the student surpassing the teacher; found vulnerable contracts that Slither mislabeled. In addition to extending a source code analyzer to bytecode, DLVA is much faster than conventional tools for smart contract vulnerability detection based on formal methods: DLVA checks contracts for 29 vulnerabilities in 0.2 seconds, a speedup of 10-500x+ compared to traditional tools. DLVA has three key components. Smart Contract to Vector (SC2V) uses neural networks to map arbitrary smart contract bytecode to an high-dimensional floating-point vector. Sibling Detector (SD) classifies contracts when a target contract's vector is Euclidian-close to a labeled contract's vector in a training set; although only able to judge 55.7% of the contracts in our test set, it has an average accuracy of 97.4% with a false positive rate of only 0.1%. Lastly, Core Classifier (CC) uses neural networks to infer vulnerable contracts regardless of vector distance. DLVA has an overall accuracy of 96.6% with an associated false positive rate of only 3.7%

    Similar works

    Full text

    thumbnail-image

    Available Versions