12 research outputs found

    Non-Negative Paratuck2 Tensor Decomposition Combined to LSTM Network for Smart Contracts Profiling

    Background: The past few months have seen the rise of blockchain and cryptocurrencies. In this context, the Ethereum platform, an open-source blockchain-based platform using the Ether cryptocurrency, has been designed to use smart contract programs. These are self-executing blockchain contracts. Due to their high volume of transactions, analyzing their behavior is very challenging. We address this challenge in our paper. Methods: For this purpose, we develop an innovative approach based on the non-negative tensor decomposition Paratuck2 combined with long short-term memory. The objective is to assess whether predictive analysis can forecast smart contract activities over time. Three statistical tests are performed on the predictive analytics: the mean absolute percentage error, the mean directional accuracy and the Jaccard distance. Results: Across dozens of GB of transactions, the Paratuck2 tensor decomposition allows asymmetric modeling of the smart contracts. Furthermore, it highlights time-dependent latent groups. The latent activities are modeled by the long short-term memory network for predictive analytics. The highly accurate predictions underline the accuracy of the method and show that blockchain activities are not purely random. Conclusion: We are able to detect the most active contracts and predict their behavior. In the context of future regulations, our approach opens new perspectives for monitoring blockchain activities.
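
    The three measures named in this abstract can be sketched in a few lines. This is a minimal illustration using the usual textbook definitions; the abstract does not say how the measures are applied to the contract data, so the sample series and the contract identifiers below are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mda(actual, predicted):
    """Mean directional accuracy: share of steps where the predicted change
    from the previous actual value has the same sign as the actual change."""
    def sign(x):
        return (x > 0) - (x < 0)

    hits = sum(
        sign(actual[t] - actual[t - 1]) == sign(predicted[t] - actual[t - 1])
        for t in range(1, len(actual))
    )
    return hits / (len(actual) - 1)


def jaccard_distance(set_a, set_b):
    """Jaccard distance between two sets: 1 - |A & B| / |A | B|."""
    a, b = set(set_a), set(set_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


# Hypothetical activity series and hypothetical sets of "most active" contracts.
actual = [10.0, 12.0, 11.0, 15.0]
predicted = [9.0, 13.0, 12.0, 14.0]
print(mape(actual, predicted))
print(mda(actual, predicted))
print(jaccard_distance({"c1", "c2", "c3"}, {"c2", "c3", "c4"}))
```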

    Predicting Sparse Clients' Actions with CPOPT-Net in the Banking Environment

    The digital revolution of the banking system, together with evolving European regulations, has pushed the major banking actors to innovate through new uses of their clients' digital information. Given highly sparse client activities, we propose CPOPT-Net, an algorithm that combines neural networks with the CP canonical tensor decomposition, a multidimensional matrix decomposition that factorizes a tensor as the sum of rank-one tensors. CPOPT-Net efficiently removes sparse information with a gradient-based resolution while relying on neural networks for time series predictions. Our experiments show that CPOPT-Net is capable of performing accurate predictions of the clients' actions in the context of personalized recommendation. CPOPT-Net is the first algorithm to use non-linear conjugate gradient tensor resolution with neural networks to propose predictions of financial activities on a public data set.
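
    The phrase "sum of rank-one tensors" can be made concrete with a small sketch. It only rebuilds a three-way tensor from given factor matrices; it does not fit the factors, which is the part CPOPT-Net handles with a gradient-based resolution. The factor values and the function name are hypothetical.

```python
def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor X[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r].

    A, B, C are factor matrices (lists of rows) sharing the same number of
    columns R; each column triple (a_r, b_r, c_r) defines one rank-one tensor.
    """
    R = len(A[0])
    return [
        [
            [sum(A[i][r] * B[j][r] * C[k][r] for r in range(R))
             for k in range(len(C))]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


# Hypothetical factors for a 2 x 2 x 2 tensor built from R = 2 rank-one terms.
A = [[1.0, 0.0], [2.0, 1.0]]
B = [[1.0, 1.0], [0.0, 3.0]]
C = [[1.0, 2.0], [1.0, 0.0]]
X = cp_reconstruct(A, B, C)
print(X)
```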

    PHom-GeM: Persistent Homology for Generative Models

    Generative neural network models, including Generative Adversarial Networks (GANs) and Auto-Encoders (AEs), are among the most popular neural network models for generating adversarial data. A GAN is composed of a generator that produces synthetic data and a discriminator that discriminates between the generator’s output and the true data. An AE consists of an encoder, which maps the model distribution to a latent manifold, and a decoder, which maps the latent manifold to a reconstructed distribution. However, generative models are known to produce chaotically scattered reconstructed distributions during training and, consequently, incomplete generated adversarial distributions. Current distance measures fail to address this problem because they cannot account for the shape of the data manifold, i.e. its topological features, or the scale at which the manifold should be analyzed. We propose Persistent Homology for Generative Models, PHom-GeM, a new methodology to assess and measure the distribution of a generative model. PHom-GeM minimizes an objective function between the true and the reconstructed distributions and uses persistent homology, the study of the topological features of a space at different spatial resolutions, to compare the nature of the true and the generated distributions. Our experiments underline the potential of persistent homology for Wasserstein GAN in comparison to Wasserstein AE and Variational AE. The experiments are conducted on a real-world data set that is particularly challenging for traditional distance measures and generative neural network models. PHom-GeM is the first methodology to propose a topological distance measure, the bottleneck distance, for generative models used to compare adversarial samples in the context of credit card transactions.
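
    To illustrate what "topological features at different spatial resolutions" means, the sketch below computes only the simplest case: the scales at which connected components of a point cloud merge (0-dimensional persistent homology of a Vietoris-Rips filtration). It is not the PHom-GeM pipeline and does not compute the bottleneck distance; the function name and the sample cloud are hypothetical.

```python
import math


def h0_persistence(points):
    """Death scales of connected components (0-dimensional persistent homology)
    in the Vietoris-Rips filtration of a point cloud.

    Every point is born as its own component at scale 0; a component dies when
    an edge first merges it into another. One component never dies.
    """
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components
            parent[ri] = rj
            deaths.append(length)
    return deaths


# Two well-separated clusters: two early merges and one much later merge.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
print(h0_persistence(cloud))
```

    A long gap between consecutive death scales indicates a feature that persists across resolutions, here the separation between the two clusters.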

    MQLV: Optimal Policy of Money Management in Retail Banking with Q-Learning

    Reinforcement learning has become one of the best approaches to train a computer game emulator capable of human-level performance. In a reinforcement learning approach, an optimal value function is learned across a set of actions, or decisions, that leads to a set of states giving different rewards, with the objective of maximizing the overall reward. A policy assigns to each state-action pair an expected return. We call an optimal policy a policy for which the value function is optimal. QLBS, Q-Learner in the Black-Scholes(-Merton) Worlds, applies the reinforcement learning concepts, and notably the popular Q-learning algorithm, to the financial stochastic model of Black, Scholes and Merton. It is, however, specifically optimized for the geometric Brownian motion and vanilla options. Its range of application is, therefore, limited to vanilla option pricing within the financial markets. We propose MQLV, Modified Q-Learner for the Vasicek model, a new reinforcement learning approach that determines the optimal policy of money management based on the aggregated financial transactions of the clients. It unlocks new frontiers for establishing personalized credit card limits or processing bank loan applications, targeting the retail banking industry. MQLV extends the simulation to mean-reverting stochastic diffusion processes and uses a digital function, a Heaviside step function expressed in its discrete form, to estimate the probability of a future event such as a payment default. In our experiments, we first show the similarities between a set of historical financial transactions and Vasicek-generated transactions, and then we underline the potential of MQLV on generated Monte Carlo simulations. Finally, MQLV is the first Q-learning Vasicek-based methodology addressing transparent decision-making processes in retail banking.
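
    The two ingredients named here, a mean-reverting Vasicek process and a digital (Heaviside) function used to estimate an event probability, can be sketched with a plain Monte Carlo simulation. This is not the MQLV algorithm and contains no Q-learning; the Euler-Maruyama discretization, the function names and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_vasicek(x0, speed, mean, sigma, dt, steps, rng):
    """Euler-Maruyama path of dX = speed * (mean - X) dt + sigma dW."""
    path = [x0]
    for _ in range(steps):
        x = path[-1]
        shock = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x + speed * (mean - x) * dt + shock)
    return path


def heaviside(x):
    """Discrete Heaviside step: 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0


def event_probability(threshold, n_paths, rng, **params):
    """Monte Carlo estimate of P(X_T <= threshold), obtained by averaging the
    digital payoff heaviside(threshold - X_T) over simulated terminal values."""
    hits = 0
    for _ in range(n_paths):
        terminal = simulate_vasicek(rng=rng, **params)[-1]
        hits += heaviside(threshold - terminal)
    return hits / n_paths


# Hypothetical parameters: a balance reverting to 1.0, event = ending at or below 0.8.
rng = random.Random(0)
params = dict(x0=1.0, speed=2.0, mean=1.0, sigma=0.3, dt=0.01, steps=100)
print(event_probability(threshold=0.8, n_paths=2000, rng=rng, **params))
```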

    Visualization of AE's Training on Credit Card Transactions with Persistent Homology

    Auto-encoders are among the most popular neural network architectures for dimension reduction. They are composed of two parts: the encoder, which maps the model distribution to a latent manifold, and the decoder, which maps the latent manifold to a reconstructed distribution. However, auto-encoders are known to produce chaotically scattered data distributions in the latent manifold, resulting in an incomplete reconstructed distribution. Current distance measures fail to detect this problem because they cannot account for the shape of the data manifolds, i.e. their topological features, or the scale at which the manifolds should be analyzed. We propose Persistent Homology for Wasserstein Auto-Encoders, called PHom-WAE, a new methodology to assess and measure the data distribution of a generative model. PHom-WAE minimizes the Wasserstein distance between the true distribution and the reconstructed distribution and uses persistent homology, the study of the topological features of a space at different spatial resolutions, to compare the nature of the latent manifold and the reconstructed distribution. Our experiments underline the potential of persistent homology for Wasserstein Auto-Encoders in comparison to Variational Auto-Encoders, another type of generative model. The experiments are conducted on a real-world data set that is particularly challenging for traditional distance measures and auto-encoders. PHom-WAE is the first methodology to propose a topological distance measure, the bottleneck distance, for Wasserstein Auto-Encoders used to compare decoded samples of high quality in the context of credit card transactions.

    From Persistent Homology to Reinforcement Learning with Applications for Retail Banking

    Retail banking services are one of the pillars of modern economic growth. However, the evolution of clients' habits in modern societies and the recent European regulations promoting more competition mean that retail banks will encounter serious challenges over the next few years, endangering their activities. They now face an impossible compromise: maximizing the satisfaction of their hyper-connected clients while avoiding any risk of default and remaining compliant with regulations. Therefore, advanced and novel research concepts are a serious game-changer for gaining a competitive advantage. In this context, we investigate in this thesis different concepts bridging the gap between persistent homology, neural networks, recommender engines and reinforcement learning, with the aim of improving the quality of retail banking services. Our contribution is threefold. First, we highlight how to overcome insufficient financial data by generating artificial data using generative models and persistent homology. Then, we present how to perform accurate financial recommendations in multiple dimensions. Finally, we underline a model-free reinforcement learning approach to determine the optimal policy of money management based on the aggregated financial transactions of the clients. Our experimental data sets, extracted from well-known institutions without putting the privacy and confidentiality of the clients at risk, support our contributions. In this work, we provide the motivations of our retail banking research project, describe the theory employed to improve the quality of financial services, and evaluate our methodologies quantitatively and qualitatively for each of the proposed research scenarios.

    Modeling Smart Contracts Activities: A Tensor based Approach

    Smart contracts are autonomous software executing predefined conditions. Two of the biggest advantages of smart contracts are secured protocols and reduced transaction costs. On the Ethereum platform, an open-source blockchain-based platform, smart contracts implement a distributed virtual machine on the distributed ledger. To avoid denial of service attacks and monetize the services, payment transactions are executed whenever code is executed between contracts. It is thus natural to investigate whether predictive analysis is capable of forecasting these interactions. We have addressed this issue and proposed an innovative application of the tensor decomposition CANDECOMP/PARAFAC to the temporal link prediction of smart contracts. We introduce a new approach leveraging stochastic processes for series predictions based on the tensor decomposition that can be used for smart contract predictive analytics.

    Non-Negative Paratuck2 Tensor Decomposition Combined to LSTM Network For Smart Contracts Profiling

    Smart contracts are programs stored and executed on a blockchain. The Ethereum platform, an open-source blockchain-based platform, has been designed to use these programs, offering secured protocols and reduced transaction costs. The Ethereum Virtual Machine runs smart contracts, where the execution of each contract is limited to the amount of gas required to execute the operations described in the code. Each gas unit must be paid for using Ether, the crypto-currency of the platform. Because smart contract interactions evolve over time, analyzing the behavior of smart contracts is very challenging. We address this challenge in our paper. For this purpose, we develop an innovative approach based on the non-negative tensor decomposition PARATUCK2 combined with long short-term memory (LSTM) to assess whether predictive analysis can forecast smart contract interactions over time. To validate our methodology, we report results for two use cases. The main use case is related to analyzing smart contracts and sheds some light on the complex interactions among smart contracts. In order to show the generality of our method on other use cases, we also report its performance on video-on-demand recommendation.
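
    For readers unfamiliar with PARATUCK2, the sketch below evaluates one frontal slice of the model in its commonly stated form, X_k = A diag(DA_k) H diag(DB_k) B^T. It shows the model equation only, not the non-negative fitting procedure or the LSTM step described in the abstract; the function name and all values are hypothetical, and non-negativity is simply reflected by using non-negative entries.

```python
def paratuck2_slice(A, DA_k, H, DB_k, B):
    """Frontal slice X_k = A diag(DA_k) H diag(DB_k) B^T.

    A: I x P, H: P x Q, B: J x Q; DA_k (length P) and DB_k (length Q) hold the
    diagonal weights of the latent groups for slice k (for example, time step k).
    P and Q may differ, which is what permits asymmetric modeling of the two modes.
    """
    P, Q = len(H), len(H[0])
    return [
        [
            sum(A[i][p] * DA_k[p] * H[p][q] * DB_k[q] * B[j][q]
                for p in range(P) for q in range(Q))
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


# Hypothetical factors: 2 row entities in P = 2 groups, 3 column entities in Q = 1 group.
A = [[1.0, 0.0], [0.0, 1.0]]
H = [[1.0], [2.0]]
B = [[1.0], [0.0], [1.0]]
X_k = paratuck2_slice(A, [1.0, 0.25], H, [2.0], B)
print(X_k)
```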

    Knowledge Discovery Approach from Blockchain, Crypto-currencies, and Financial Stock Exchanges

    The last few years have witnessed a steady growth of interest in crypto-currencies and blockchains. They are receiving considerable attention from industry and the research community, the most popular one being Bitcoin. However, these crypto-currencies have so far been relatively poorly analyzed and investigated. Recently, many solutions, mostly ad-hoc engineered ones, have been developed to extract relevant analyses from crypto-currencies, but they are not sufficient to understand what lies behind crypto-currencies. In this paper, we provide a deep analysis of crypto-currencies by proposing a new knowledge discovery approach for each crypto-currency, across crypto-currencies, blockchains, and financial stocks. The novel approach is based on a conjoint use of data mining algorithms on imbalanced time series. It automatically reports co-variation dependency patterns of the time series. The experiments on public crypto-currency and financial stock market data also demonstrate the usefulness of the approach by discovering the different relationships across multiple time series sources and providing insights into the correlations behind crypto-currencies.

    Your Moves, Your Device: Establishing Behavior Profiles Using Tensors

    Smartphones have become a person's constant companion. As the strictly personal devices they are, they gradually enable the replacement of well-established activities such as payments, two-factor authentication or personal assistants. In addition, Internet of Things (IoT) gadgets extend the capabilities of the latter even further. Devices such as body-worn fitness trackers allow users to keep track of daily activities by periodically synchronizing data with the smartphone and ultimately with the vendor's computational centers in the cloud. These fitness trackers are equipped with an array of sensors to measure the movements of the device, to derive information such as step counts or to make assessments about sleep quality. We capture the raw sensor data from wrist-worn activity trackers to model a biometric behavior profile of the carrier. We establish and present techniques to determine whether the original person, who trained the model, or another individual is currently wearing the bracelet. Our contribution is based on the CANDECOMP/PARAFAC (CP) tensor decomposition, whose computational complexity allows either execution on light computational devices at low-precision settings, or migration to stronger CPUs or to the cloud for high to very high granularity. This precision parameter allows the security layer to be adaptable, in order to comply with the requirements set by the use cases. We show that our approach identifies users with high confidence.