
    Ensemble Classifiers and Their Applications: A Review

    An ensemble classifier is a group of individual classifiers that are cooperatively trained on a data set for a supervised classification problem. In this paper we present a review of ensemble classifiers commonly used in the literature. Some ensemble classifiers are also developed to target specific applications, and we present a number of such application-driven ensemble classifiers as well.
    Comment: Published in the International Journal of Computer Trends and Technology (IJCTT)
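    As a concrete illustration of the ensemble idea described above, the minimal sketch below trains a few heterogeneous base classifiers on the same labelled data and combines their predictions by majority vote. The dataset and the choice of base models are illustrative and not taken from the review itself.

```python
# Minimal sketch of a majority-vote ensemble: several base classifiers are
# trained on the same labelled data and their predictions are combined by
# voting. Dataset and model choices are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=5)),
        ("rf", RandomForestClassifier(n_estimators=100)),
    ],
    voting="hard",  # majority vote over the individual predictions
)
ensemble.fit(X_tr, y_tr)
print("ensemble accuracy:", ensemble.score(X_te, y_te))
```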

    Predictive Modeling with Delayed Information: a Case Study in E-commerce Transaction Fraud Control

    In Business Intelligence, accurate predictive modeling is the key to providing adaptive decisions. We studied predictive modeling problems motivated by real-world cases that Microsoft data scientists encountered while making e-commerce transaction fraud control decisions on streaming transaction data in an uncertain probabilistic decision environment. The values of most features of an online transaction are available instantly, while the true fraud label only arrives after a stochastic delay. Using such partially mature data directly for predictive modeling in an uncertain probabilistic decision environment leads to significant inaccuracy in risk decision-making. To estimate the probabilistic decision environment more accurately, and thereby improve predictive modeling, two frameworks are proposed: Current Environment Inference (CEI) and Future Environment Inference (FEI). These frameworks generate decision-environment features from long-term fully mature and short-term partially mature data, and estimate the values of those features using a variety of learning methods, including linear regression, random forest, gradient boosted trees, artificial neural networks, and recurrent neural networks. Performance tests were conducted on e-commerce transaction data from Microsoft. The results suggest that the proposed frameworks significantly improve the accuracy of decision-environment estimation.
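    The abstract does not spell out the frameworks' internals, so the following is only a toy sketch of the underlying idea: windows of long-term, fully mature data supply ground-truth fraud rates, and a regressor learns to correct the under-reported rate observed in recent, partially mature windows. The synthetic data, the label-maturity model, and all variable names are assumptions made for illustration, not the Microsoft implementation.

```python
# Hedged sketch: learn to infer the true decision environment (fraud rate)
# from partially mature observations whose labels have not all arrived yet.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

age = rng.uniform(0, 30, size=2000)                   # days since each window closed
true_rate = 0.02 + 0.01 * rng.standard_normal(2000).clip(-1, 1)
observed_rate = true_rate * (age / 30.0)              # young windows under-report fraud

X = np.column_stack([observed_rate, age])             # short-term, partially mature signals
y = true_rate                                         # known only once labels fully mature

model = GradientBoostingRegressor().fit(X, y)         # one of several learners one could try

# estimate the current environment from a fresh, still-immature window
print(model.predict([[0.004, 5.0]]))                  # corrected fraud-rate estimate
```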

    A Simple Dynamic Mind-map Framework To Discover Associative Relationships in Transactional Data Streams

    In this paper, we informally introduce dynamic mind-maps, which represent a new approach based on the dynamic construction of connectionist structures during the processing of a data stream. This allows the representation and processing of recursively defined structures and avoids the problem that a more traditional, fixed-size architecture has with input structures of unknown size. For data-stream analysis with association discovery, the incremental analysis of data yields results on demand. Here, we describe a framework that uses symbolic cells to calculate associations from transactional data streams such as those found in, e.g., bibliographic databases. We follow a natural paradigm of applying simple operations on cells, yielding a mind-map structure that adapts over time.
    Comment: 12 pages, 8 figures. Updated version of a paper presented at the Workshop on Symbolic Networks, ECAI 2004, Valencia, Spain
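    To make the incremental, results-on-demand flavor concrete, here is a minimal sketch (not the paper's symbolic-cell framework): each item gets a "cell", links between cells are strengthened as transactions stream in, and association queries can be answered at any point during processing.

```python
# Minimal sketch of incremental association discovery over a transactional stream.
from collections import defaultdict
from itertools import combinations

link_weight = defaultdict(float)   # (item_a, item_b) -> association strength
item_count = defaultdict(int)

def process(transaction):
    """Update the mind-map-like structure with one incoming transaction."""
    for item in transaction:
        item_count[item] += 1
    for a, b in combinations(sorted(set(transaction)), 2):
        link_weight[(a, b)] += 1.0

def associations(item, top=3):
    """Results on demand: the strongest links touching `item` so far."""
    links = {pair: w for pair, w in link_weight.items() if item in pair}
    return sorted(links.items(), key=lambda kv: -kv[1])[:top]

for tx in [["smith", "graphs"], ["smith", "streams"], ["smith", "graphs"]]:
    process(tx)
print(associations("smith"))
```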

    Intelligent Paging Strategy for Multi-Carrier CDMA System

    Subscriber satisfaction and maximum radio-resource utilization are the pivotal criteria in communication system design. In a multi-carrier CDMA system, different paging algorithms are used to locate a user within the shortest possible time and with the best possible utilization of radio resources; different algorithms emphasize different techniques for different purposes. However, the low servicing time of sequential search and the better radio-resource utilization of concurrent search can be exploited simultaneously by swapping between the algorithms. In this paper, an intelligent mechanism is developed for dynamic algorithm assignment based on time-varying traffic demand, which is predicted by a radial basis function neural network, and its performance is analyzed with respect to the prediction efficiency on different types of data. High prediction efficiency is observed, with a correlation coefficient of 0.99, and consequently better performance is achieved by dynamic paging algorithm assignment. This claim is substantiated by the results of the proposed intelligent paging strategy.
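    A toy sketch of the switching idea follows: an RBF-kernel regressor stands in for the radial basis function network, predicts the next traffic level, and the paging mode is chosen accordingly (sequential when predicted load is low, concurrent when it is high). The traffic pattern, the KernelRidge stand-in, and the threshold are assumptions for illustration only.

```python
# Hedged sketch: predict traffic with an RBF-kernel model and switch paging mode.
import numpy as np
from sklearn.kernel_ridge import KernelRidge   # RBF-kernel stand-in for an RBF network

hours = np.arange(48).reshape(-1, 1)
traffic = 50 + 30 * np.sin(2 * np.pi * hours.ravel() / 24)   # toy daily traffic pattern

predictor = KernelRidge(kernel="rbf", gamma=0.05).fit(hours, traffic)

def choose_paging(next_hour, threshold=60.0):
    predicted = predictor.predict([[next_hour]])[0]
    # low predicted load: sequential paging (short servicing time);
    # high predicted load: concurrent paging (better radio-resource utilization)
    mode = "sequential" if predicted < threshold else "concurrent"
    return mode, predicted

print(choose_paging(49))
```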

    A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem

    Financial portfolio management is the process of constant redistribution of a fund across different financial products. This paper presents a financial-model-free reinforcement learning framework that provides a deep machine learning solution to the portfolio management problem. The framework consists of the Ensemble of Identical Independent Evaluators (EIIE) topology, a Portfolio-Vector Memory (PVM), an Online Stochastic Batch Learning (OSBL) scheme, and a fully exploiting and explicit reward function. The framework is realized in three instances in this work: with a Convolutional Neural Network (CNN), a basic Recurrent Neural Network (RNN), and a Long Short-Term Memory (LSTM). Along with a number of recently reviewed or published portfolio-selection strategies, they are examined in three back-test experiments with a trading period of 30 minutes in a cryptocurrency market. Cryptocurrencies are electronic and decentralized alternatives to government-issued money, with Bitcoin as the best-known example. All three instances of the framework monopolize the top three positions in all experiments, outdistancing the other trading algorithms compared. Despite a high commission rate of 0.25% in the back-tests, the framework is able to achieve at least 4-fold returns in 50 days.
    Comment: 30 pages, 5 figures, submitting to JML
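    The explicit reward such a framework maximizes is, roughly, the periodic log return of the portfolio net of transaction costs. The snippet below is only a back-of-the-envelope version of that quantity with a simplified commission model; the asset names, weights, and price moves are made up and are not the paper's exact formulation.

```python
# Sketch of a commission-aware periodic log-return reward for portfolio RL.
import numpy as np

def period_log_return(weights, prev_weights, price_relatives, commission=0.0025):
    """weights: allocation chosen for this period (sums to 1, cash included);
    price_relatives: close/open price ratio of each asset over the period."""
    turnover = np.abs(weights - prev_weights).sum()      # fraction of the fund moved
    transaction_factor = 1.0 - commission * turnover     # simplified commission model
    growth = transaction_factor * np.dot(weights, price_relatives)
    return np.log(growth)

w_prev = np.array([0.5, 0.3, 0.2])          # e.g. cash, BTC, ETH (illustrative)
w_new  = np.array([0.2, 0.5, 0.3])
y      = np.array([1.0, 1.04, 0.99])        # cash stays flat, assets move
print(period_log_return(w_new, w_prev, y))
```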

    Financial Time Series Prediction Using Deep Learning

    In this work we present a data-driven, end-to-end deep learning approach for time series prediction, applied to financial time series. A deep learning scheme is derived to predict the temporal trends of stocks and ETFs on the NYSE or NASDAQ. Our approach is based on a neural network (NN) that is applied to raw financial data inputs and trained to predict the temporal trends of stocks and ETFs. To handle commission-based trading, we derive an investment strategy that utilizes the probabilistic outputs of the NN and optimizes the average return. The proposed scheme is shown to provide statistically significant, accurate predictions of financial market trends, and the investment strategy is shown to be profitable under this challenging setup. The performance compares favorably with contemporary benchmarks over two years of back-testing.
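    One way to build a commission-aware rule on top of probabilistic predictions is to trade only when the predicted probability of an up-move is far enough from 0.5 that the expected edge covers the commission. The sketch below is an assumed illustration of that idea, not the strategy derived in the paper; the edge margin and the toy probabilities are arbitrary.

```python
# Hedged sketch: threshold probabilistic predictions so trades cover commission.
import numpy as np

def positions(up_probabilities, commission=0.001, edge=0.05):
    """Return +1 (long), -1 (short) or 0 (stay out) for each prediction."""
    p = np.asarray(up_probabilities, dtype=float)
    pos = np.zeros_like(p)
    pos[p > 0.5 + edge + commission] = 1.0
    pos[p < 0.5 - edge - commission] = -1.0
    return pos

print(positions([0.51, 0.62, 0.40, 0.30]))   # -> [ 0.  1. -1. -1.]
```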

    Characterizing Entities in the Bitcoin Blockchain

    Bitcoin has created a new exchange paradigm within which financial transactions can be trusted without an intermediary. This premise of a free decentralized transactional network, however, requires, in its current implementation, unrestricted access to the ledger for peer-based transaction verification. A number of studies have shown that, in this pseudonymous context, identities can be leaked based on transaction features or off-network information. In this work, we analyze the information revealed by the pattern of transactions in the neighborhood of a given entity's transactions. By definition, these features, which pertain to an extended network, are not directly controllable by the entity, but they might enable leakage of information about the transacting entities. We define a number of new features relevant to entity characterization on the Bitcoin blockchain and study their efficacy in practice. We show that even a weak attacker with shallow data mining knowledge is able to leverage these features to characterize entity properties.
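    The sketch below illustrates the general notion of neighborhood features: statistics computed not from an entity's own transactions but from the transactions of its direct counterparties, which the entity cannot control. The toy ledger, the feature names, and the values are all made up and are not the features defined in the paper.

```python
# Illustrative neighborhood features for an entity in a toy transaction graph.
from statistics import mean

# payer -> list of (payee, amount)
ledger = {
    "A": [("B", 1.2), ("C", 0.4)],
    "B": [("D", 0.9), ("C", 2.5)],
    "C": [("D", 0.1)],
}

def neighborhood_features(entity):
    counterparties = {payee for payee, _ in ledger.get(entity, [])}
    # transactions made by the counterparties (the "extended network")
    second_hop = [amt for c in counterparties for _, amt in ledger.get(c, [])]
    return {
        "out_degree": len(ledger.get(entity, [])),
        "n_counterparties": len(counterparties),
        "neighbor_tx_count": len(second_hop),
        "neighbor_mean_amount": mean(second_hop) if second_hop else 0.0,
    }

print(neighborhood_features("A"))
```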

    Scalable Graph Learning for Anti-Money Laundering: A First Look

    Organized crime inflicts human suffering on a genocidal scale: the Mexican drug cartels have murdered 150,000 people since 2006, and upwards of 700,000 people per year are "exported" by a human trafficking industry enslaving an estimated 40 million people. These nefarious industries rely on sophisticated money laundering schemes to operate. Despite tremendous resources dedicated to anti-money laundering (AML), only a tiny fraction of illicit activity is prevented. The research community can help. In this brief paper, we map the structural and behavioral dynamics driving the technical challenge. We review AML methods, current and emergent. We provide a first look at scalable graph convolutional neural networks for forensic analysis of financial data, which is massive, dense, and dynamic. We report preliminary experimental results using a large synthetic graph (1M nodes, 9M edges) generated by a data simulator we created called AMLSim. We consider opportunities for high-performance efficiency, in terms of computation and memory, and we share results from a simple graph compression experiment. Our results support our working hypothesis that graph deep learning for AML holds great promise in the fight against criminal financial activity.
    Comment: NeurIPS 2018 Workshop on Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy, Montreal, Canada
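    For readers unfamiliar with graph convolutional networks, the snippet below shows the standard single-layer propagation rule H' = ReLU(A_hat · H · W), where A_hat is the symmetrically normalized adjacency matrix with self-loops. The tiny random graph and feature sizes are illustrative; the paper's experiments apply this kind of layer at million-node scale.

```python
# Back-of-the-envelope sketch of one graph-convolution layer on a toy graph.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, in_dim, out_dim = 5, 4, 2

A = (rng.random((n_nodes, n_nodes)) < 0.4).astype(float)   # toy transaction graph
A = np.maximum(A, A.T)                                      # make it symmetric
A_hat = A + np.eye(n_nodes)                                 # add self-loops
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
A_hat = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # symmetric normalization

H = rng.random((n_nodes, in_dim))      # node features (e.g. account statistics)
W = rng.random((in_dim, out_dim))      # learnable layer weights

H_next = np.maximum(A_hat @ H @ W, 0)  # one GCN layer with ReLU
print(H_next.shape)
```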

    Probabilistic Semantic Web Mining Using Artificial Neural Analysis

    Most web users' requirements are short search and navigation times and correctly matched results. These constraints can be satisfied by attaching some additional modules to existing search engines and web servers. This paper proposes a powerful architecture for search engines, titled Probabilistic Semantic Web Mining after the methods used. With ever larger collections of diverse data resources on the World Wide Web (WWW), web mining has become one of the most important requirements for web users. Web servers store data in various formats, including text, image, audio, and video, but they cannot identify the contents of that data. Search techniques can be improved by adding special techniques, including semantic web mining and probabilistic analysis, to obtain more accurate results. Semantic web mining provides meaningful search of data resources by eliminating useless information during the mining process. In this technique, web servers maintain meta-information for each data resource available on that particular server, which helps the search engine retrieve information relevant to the user's input string. This paper proposes combining these two techniques, semantic web mining and probabilistic analysis, for efficient and accurate web-mining search results. The SPF is calculated by considering both the semantic accuracy and the syntactic accuracy of the data with respect to the input string, and is the deciding factor for producing results.
    Comment: IEEE Publication format, ISSN 1947-5500, http://sites.google.com/site/ijcsis
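    The abstract does not define the SPF precisely, so the weighted combination below is only an assumed illustration of "considering both semantic accuracy and syntactic accuracy" when ranking resources against a query string. The scoring functions, the weighting parameter, and all inputs are hypothetical.

```python
# Assumed illustration of blending semantic and syntactic accuracy into one score.
from difflib import SequenceMatcher

def syntactic_accuracy(query, text):
    """String-level similarity between the query and the resource description."""
    return SequenceMatcher(None, query.lower(), text.lower()).ratio()

def semantic_accuracy(query_terms, resource_tags):
    """Overlap between query concepts and the server-side metadata tags."""
    q, t = set(query_terms), set(resource_tags)
    return len(q & t) / len(q) if q else 0.0

def spf(query, query_terms, text, tags, alpha=0.6):
    # alpha weights semantics against syntax; the value 0.6 is arbitrary here
    return alpha * semantic_accuracy(query_terms, tags) + \
           (1 - alpha) * syntactic_accuracy(query, text)

print(spf("neural web mining", ["neural", "web", "mining"],
          "probabilistic mining of web data", ["web", "mining", "probability"]))
```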

    Shared Predictive Cross-Modal Deep Quantization

    With the explosive growth of data volume and the ever-increasing diversity of data modalities, cross-modal similarity search, which conducts nearest-neighbor search across different modalities, has been attracting increasing interest. This paper presents a deep compact-code learning solution for efficient cross-modal similarity search. Many recent studies have shown that quantization-based approaches generally perform better than hashing-based approaches on single-modal similarity search. In this paper, we propose a deep quantization approach, which is among the early attempts to leverage deep neural networks for quantization-based cross-modal similarity search. Our approach, dubbed shared predictive deep quantization (SPDQ), explicitly formulates a shared subspace across the different modalities and two private subspaces for the individual modalities; representations in the shared and private subspaces are learned simultaneously by embedding them into a reproducing kernel Hilbert space, where the mean embeddings of the different modality distributions can be explicitly compared. In addition, in the shared subspace, a quantizer is learned to produce semantics-preserving compact codes with the help of label alignment. Thanks to this novel network architecture, in cooperation with supervised quantization training, SPDQ can preserve intramodal and intermodal similarities as much as possible and greatly reduce the quantization error. Experiments on two popular benchmarks corroborate that our approach outperforms state-of-the-art methods.
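    To illustrate only the quantization step, the sketch below encodes embeddings that are assumed to already live in a shared subspace as indices of their nearest codebook entries, so cross-modal search can run over compact codes. The real SPDQ learns the subspaces, codebook, and label alignment jointly; this simplified encode/decode example is not the paper's method, and the codebook here is random.

```python
# Simplified sketch of nearest-codeword quantization of shared-subspace embeddings.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 8))        # 16 codewords in an 8-d shared space

def encode(embeddings):
    """Assign each embedding to its nearest codeword (compact code = index)."""
    dists = np.linalg.norm(embeddings[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def decode(codes):
    return codebook[codes]                      # approximate reconstruction

image_embeddings = rng.standard_normal((4, 8))  # e.g. image-branch outputs
codes = encode(image_embeddings)
print(codes, np.linalg.norm(image_embeddings - decode(codes)))
```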