Ensemble Classifiers and Their Applications: A Review
An ensemble classifier is a group of individual classifiers that are cooperatively trained on a data set for a supervised classification problem. In this paper we present a review of ensemble classifiers commonly used in the literature. Some ensemble classifiers are also developed targeting specific applications; we present some of these application-driven ensemble classifiers as well.
Comment: Published with International Journal of Computer Trends and Technology (IJCTT)
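The combination rule is what makes a group of classifiers an ensemble. As a minimal sketch (not any specific method from the review), majority voting over hypothetical toy members looks like this:

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Combine individual classifiers by majority vote: each member
    predicts a label for x, and the most common label wins."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Three toy threshold classifiers standing in for trained members.
clf_a = lambda x: 1 if x > 0.3 else 0
clf_b = lambda x: 1 if x > 0.5 else 0
clf_c = lambda x: 1 if x > 0.7 else 0

print(majority_vote([clf_a, clf_b, clf_c], 0.6))  # two of three members vote 1
```

Real ensembles such as bagging or boosting differ in how the members are trained, but many share this voting step at prediction time.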
Predictive Modeling with Delayed Information: a Case Study in E-commerce Transaction Fraud Control
In Business Intelligence, accurate predictive modeling is the key to providing adaptive decisions. We studied predictive modeling problems in this research, motivated by real-world cases that Microsoft data scientists encountered while dealing with e-commerce transaction fraud control decisions using streaming transaction data in an uncertain probabilistic decision environment. The values of most features related to an online transaction are available instantly, while the true fraud labels only return after a stochastic delay. Using partially mature data directly for predictive modeling in such an environment leads to significant inaccuracy in risk decision-making. To improve the estimation of the probabilistic prediction environment, which in turn leads to more accurate predictive modeling, two frameworks, Current Environment Inference (CEI) and Future Environment Inference (FEI), are proposed. These frameworks generate decision-environment-related features using long-term fully mature and short-term partially mature data, and the values of those features are estimated using a variety of learning methods, including linear regression, random forest, gradient boosted trees, artificial neural networks, and recurrent neural networks. Performance tests were conducted using e-commerce transaction data from Microsoft. The results suggest that the proposed frameworks significantly improve the accuracy of decision environment estimation.
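The core difficulty is that recent transactions have no labels yet. A minimal sketch of the maturity idea (an illustrative simplification, not the CEI/FEI frameworks themselves) estimates the environment's fraud rate only from transactions old enough for their labels to have arrived:

```python
def estimate_fraud_rate(transactions, now, maturity_days=30):
    """Estimate the environment's fraud rate using only 'mature'
    transactions, whose delayed labels have had time to arrive.
    Each transaction is (timestamp_day, label_or_None); None means
    the fraud label has not returned yet."""
    mature = [lbl for day, lbl in transactions
              if now - day >= maturity_days and lbl is not None]
    if not mature:
        return 0.0
    return sum(mature) / len(mature)

# Labels for the two recent transactions have not returned yet.
txns = [(0, 1), (5, 0), (10, 0), (35, None), (38, None)]
print(estimate_fraud_rate(txns, now=40))  # uses only the three mature ones
```

CEI and FEI go further by blending such long-term fully mature statistics with short-term partially mature data, but the split shown here is the starting point.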
A Simple Dynamic Mind-map Framework To Discover Associative Relationships in Transactional Data Streams
In this paper, we informally introduce dynamic mind-maps, which represent a new approach based on the dynamic construction of connectionist structures during the processing of a data stream. This allows the representation and processing of recursively defined structures and avoids the problem that a more traditional, fixed-size architecture has with input structures of unknown size. For data stream analysis with association discovery, the incremental analysis of data leads to results on demand. Here, we describe a framework that uses symbolic cells to calculate associations based on transactional data streams as they exist in, e.g., bibliographic databases. We follow a natural paradigm of applying simple operations on cells, yielding a mind-map structure that adapts over time.
Comment: 12 pages, 8 figures. Updated version of a paper presented at the Workshop on Symbolic Networks, ECAI 2004, Valencia, Spain
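The incremental flavor of this approach can be sketched with a simple pairwise co-occurrence counter over a transactional stream (an illustrative stand-in for the paper's symbolic-cell framework; the class and item names are hypothetical):

```python
from collections import defaultdict
from itertools import combinations

class AssociationMap:
    """Incrementally count pairwise co-occurrences over a transactional
    stream; the association structure adapts as each transaction arrives,
    and results are available on demand at any point."""
    def __init__(self):
        self.pair_counts = defaultdict(int)

    def observe(self, transaction):
        # Count every unordered pair of distinct items in this transaction.
        for a, b in combinations(sorted(set(transaction)), 2):
            self.pair_counts[(a, b)] += 1

    def strongest(self):
        # The most frequently co-occurring pair seen so far.
        return max(self.pair_counts, key=self.pair_counts.get)

m = AssociationMap()
# E.g. co-author lists streaming from a bibliographic database.
for t in [["smith", "jones"], ["smith", "jones", "lee"], ["lee", "wu"]]:
    m.observe(t)
print(m.strongest())
```

A dynamic mind-map additionally grows and reshapes its cell structure over time, rather than keeping a flat count table, but the on-demand incremental update shown here is the same.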
Intelligent Paging Strategy for Multi-Carrier CDMA System
Subscriber satisfaction and maximum radio resource utilization are the pivotal criteria in communication system design. In multi-carrier CDMA systems, different paging algorithms are used to locate a user within the shortest possible time and with the best possible utilization of radio resources. Different paging algorithms emphasize different techniques for different purposes. However, the low servicing time of sequential search and the better radio resource utilization of concurrent search can be exploited simultaneously by swapping between the algorithms. In this paper, an intelligent mechanism is developed for dynamic algorithm assignment based on time-varying traffic demand, which is predicted by a radial basis function neural network; its performance is analyzed based on the prediction efficiency for different types of data. High prediction efficiency is observed, with a good correlation coefficient (0.99), and consequently better performance is achieved by dynamic paging algorithm assignment. This claim is substantiated by the results of the proposed intelligent paging strategy.
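The swapping mechanism reduces to a policy that maps the predicted traffic level to a paging algorithm. A minimal sketch (the threshold and the regime-to-algorithm mapping are illustrative assumptions, not the paper's tuned policy; in practice the traffic value would come from the radial basis function network's prediction):

```python
def choose_paging(predicted_traffic, threshold=0.7):
    """Dynamic paging algorithm assignment. Following the abstract's
    framing, sequential search offers low servicing time while concurrent
    search utilizes radio resources better, so under heavy predicted
    traffic (scarce resources) we page concurrently, and under light
    traffic we page sequentially for speed."""
    return "concurrent" if predicted_traffic > threshold else "sequential"

print(choose_paging(0.9))  # heavy predicted load
print(choose_paging(0.2))  # light predicted load
```

The paper's contribution is in predicting `predicted_traffic` accurately enough (correlation coefficient 0.99) that this switch is made at the right times.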
A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem
Financial portfolio management is the process of constant redistribution of a
fund into different financial products. This paper presents a
financial-model-free Reinforcement Learning framework to provide a deep machine
learning solution to the portfolio management problem. The framework consists
of the Ensemble of Identical Independent Evaluators (EIIE) topology, a
Portfolio-Vector Memory (PVM), an Online Stochastic Batch Learning (OSBL)
scheme, and a fully exploiting and explicit reward function. This framework is realized in three instances in this work, with a Convolutional Neural Network (CNN), a basic Recurrent Neural Network (RNN), and a Long Short-Term Memory (LSTM) network. They are examined, along with a number of recently reviewed or published portfolio-selection strategies, in three back-test experiments with a trading period of 30 minutes in a cryptocurrency market. Cryptocurrencies are electronic, decentralized alternatives to government-issued money, with Bitcoin the best-known example. All three instances of the framework occupy the top three positions in all experiments, outdistancing the other trading algorithms compared. Even with a high commission rate of 0.25% in the back-tests, the framework achieves at least 4-fold returns in 50 days.
Comment: 30 pages, 5 figures, submitted to JMLR
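An explicit reward function for portfolio management typically scores each trading period by the log growth of portfolio value net of commission. The sketch below is a simplified stand-in for the paper's reward (the exact commission treatment in the paper differs; this version charges a cost proportional to turnover):

```python
import math

def period_log_return(weights, price_relatives, prev_weights, commission=0.0025):
    """Reward for one trading period: log of portfolio value growth,
    reduced by a commission proportional to the amount traded.
    price_relatives[i] = close_t / close_{t-1} for asset i."""
    growth = sum(w * y for w, y in zip(weights, price_relatives))
    turnover = sum(abs(w - wp) for w, wp in zip(weights, prev_weights))
    cost = 1.0 - commission * turnover  # fraction of value kept after fees
    return math.log(growth * cost)

# Rebalancing from 60/40 to 50/50 while asset 1 gains 2% and asset 2 loses 1%.
r = period_log_return([0.5, 0.5], [1.02, 0.99], [0.6, 0.4])
print(r)
```

Summing these per-period log returns gives the log of cumulative wealth, which is why an RL agent maximizing their average directly maximizes final portfolio value.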
Financial Time Series Prediction Using Deep Learning
In this work we present a data-driven end-to-end Deep Learning approach for
time series prediction, applied to financial time series. A Deep Learning
scheme is derived to predict the temporal trends of stocks and ETFs in NYSE or
NASDAQ. Our approach is based on a neural network (NN) that is applied to raw
financial data inputs, and is trained to predict the temporal trends of stocks
and ETFs. In order to handle commission-based trading, we derive an investment
strategy that utilizes the probabilistic outputs of the NN, and optimizes the
average return. The proposed scheme is shown to provide statistically
significant accurate predictions of financial market trends, and the investment
strategy is shown to be profitable under this challenging setup. The performance compares favorably with contemporary benchmarks over two years of back-testing.
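An investment strategy built on a network's probabilistic outputs usually trades only when the predicted probability clears a confidence threshold, so that expected gains overcome commissions. A minimal sketch of such a rule (the threshold and action names are illustrative, not the paper's derived strategy):

```python
def decide_trade(p_up, threshold=0.6):
    """Act on the network's probabilistic output only when it is
    confident: go long if P(price up) is high, short if it is low,
    and stay out of the market in the uncertain middle band."""
    if p_up >= threshold:
        return "long"
    if p_up <= 1 - threshold:
        return "short"
    return "hold"

for p in (0.75, 0.5, 0.2):
    print(p, decide_trade(p))
```

Raising the threshold trades less often but at higher average confidence, which is one way to keep a strategy profitable under commission-based trading.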
Characterizing Entities in the Bitcoin Blockchain
Bitcoin has created a new exchange paradigm within which financial
transactions can be trusted without an intermediary. This premise of a free
decentralized transactional network however requires, in its current
implementation, unrestricted access to the ledger for peer-based transaction
verification. A number of studies have shown that, in this pseudonymous
context, identities can be leaked based on transaction features or off-network
information. In this work, we analyze the information revealed by the pattern
of transactions in the neighborhood of a given entity transaction. By
definition, these features which pertain to an extended network are not
directly controllable by the entity, but might enable leakage of information
about transacting entities. We define a number of new features relevant to
entity characterization on the Bitcoin Blockchain and study their efficacy in
practice. We show that even a weak attacker with shallow data mining knowledge
is able to leverage these features to characterize the entity properties
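Neighborhood features of this kind summarize transactions one hop away from an entity, which the entity itself cannot directly control. A minimal sketch (the feature names and transaction format are hypothetical simplifications, not the paper's feature set):

```python
def neighborhood_features(txns, entity):
    """Features from the transaction neighborhood of an entity:
    counterparty count plus statistics of the counterparties' own
    activity, i.e. transactions that do not involve the entity itself.
    txns: list of (sender, receiver, amount) tuples."""
    neighbors = {r for s, r, _ in txns if s == entity} | \
                {s for s, r, _ in txns if r == entity}
    neigh_amounts = [a for s, r, a in txns
                     if (s in neighbors or r in neighbors)
                     and entity not in (s, r)]
    return {
        "degree": len(neighbors),
        "neighbor_tx_count": len(neigh_amounts),
        "neighbor_avg_amount": sum(neigh_amounts) / len(neigh_amounts)
                               if neigh_amounts else 0.0,
    }

txns = [("A", "B", 1.0), ("A", "C", 2.0), ("B", "C", 4.0), ("C", "D", 6.0)]
print(neighborhood_features(txns, "A"))
```

Because these values depend on what the entity's counterparties do, they leak information even when the entity itself behaves carefully, which is exactly the attack surface the paper studies.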
Scalable Graph Learning for Anti-Money Laundering: A First Look
Organized crime inflicts human suffering on a genocidal scale: the Mexican drug cartels have murdered 150,000 people since 2006, and upwards of 700,000 people per year are "exported" in a human trafficking industry enslaving an estimated 40 million people. These nefarious industries rely on sophisticated money laundering schemes to operate. Despite tremendous resources dedicated to anti-money laundering (AML), only a tiny fraction of illicit activity is prevented. The research community can help. In this brief paper, we map the
structural and behavioral dynamics driving the technical challenge. We review
AML methods, current and emergent. We provide a first look at scalable graph
convolutional neural networks for forensic analysis of financial data, which is
massive, dense, and dynamic. We report preliminary experimental results using a
large synthetic graph (1M nodes, 9M edges) generated by a data simulator we
created called AMLSim. We consider opportunities for high performance
efficiency, in terms of computation and memory, and we share results from a
simple graph compression experiment. Our results support our working hypothesis
that graph deep learning for AML bears great promise in the fight against
criminal financial activity.
Comment: NeurIPS 2018 Workshop on Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy, Montreal, Canada
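A graph convolution aggregates each account's neighborhood before applying a learned transformation, which is what lets the model use transaction-graph structure rather than per-account features alone. A minimal pure-Python sketch of one such layer (mean aggregation with self-loops; real AML-scale systems use optimized sparse implementations, and the tiny matrices here are illustrative):

```python
def gcn_layer(adj, feats, weight):
    """One graph convolution: average each node's features with its
    neighbors' (self-loop included), then apply a linear map and ReLU.
    adj: n x n 0/1 adjacency; feats: n x d; weight: d x k."""
    n = len(adj)
    agg = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] or i == j]
        row = [sum(feats[j][d] for j in nbrs) / len(nbrs)
               for d in range(len(feats[0]))]
        agg.append(row)
    # Linear transformation followed by ReLU nonlinearity.
    return [[max(0.0, sum(a * w for a, w in zip(row, col)))
             for col in zip(*weight)] for row in agg]

adj = [[0, 1], [1, 0]]                  # two accounts that transact
feats = [[1.0, 0.0], [0.0, 2.0]]        # per-account input features
weight = [[1.0], [0.5]]                 # learned projection (toy values)
print(gcn_layer(adj, feats, weight))
```

Stacking such layers widens each node's receptive field to multi-hop neighborhoods, which matters for laundering patterns that span chains of intermediary accounts.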
Probabilistic Semantic Web Mining Using Artificial Neural Analysis
Most web users' requirements concern search or navigation time and getting correctly matched results. These constraints can be satisfied with additional modules attached to existing search engines and web servers. This paper proposes a powerful architecture for search engines under the title Probabilistic Semantic Web Mining, named after the methods used. With the growth of ever larger collections of diverse data resources on the World Wide Web (WWW), web mining has become one of the most important requirements of web users. Web servers store data in various formats, including text, image, audio, and video, but they cannot identify the contents of the data. Search techniques can be improved by adding special techniques, including semantic web mining and probabilistic analysis, to obtain more accurate results. Semantic web mining can provide meaningful search of data resources by eliminating useless information through the mining process. In this technique, web servers maintain meta-information for each data resource available on that particular server. This helps the search engine retrieve information relevant to a user's input string. This paper proposes combining these two techniques, semantic web mining and probabilistic analysis, for efficient and accurate web-mining search results. The SPF can be calculated by considering both the semantic accuracy and the syntactic accuracy of the data with respect to the input string; this is the deciding factor for producing results.
Comment: IEEE Publication format, ISSN 1947 5500, http://sites.google.com/site/ijcsis
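The abstract does not expand SPF or give its formula, but a score combining semantic and syntactic accuracy is commonly a weighted blend. A purely illustrative sketch under that assumption (the weighting, function name, and document scores are hypothetical):

```python
def combined_score(semantic_acc, syntactic_acc, alpha=0.6):
    """Blend semantic accuracy (meaning-level match via metadata)
    with syntactic accuracy (string-level match against the query);
    alpha weights the semantic side. Illustrative assumption only."""
    return alpha * semantic_acc + (1 - alpha) * syntactic_acc

# (semantic_acc, syntactic_acc) per candidate resource.
docs = {"doc1": (0.9, 0.4), "doc2": (0.5, 0.95)}
ranked = sorted(docs, key=lambda d: combined_score(*docs[d]), reverse=True)
print(ranked)
```

With alpha above 0.5, a resource whose metadata matches the query's meaning can outrank one that merely matches its wording, which is the behavior the paper argues for.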
Shared Predictive Cross-Modal Deep Quantization
With explosive growth of data volume and ever-increasing diversity of data
modalities, cross-modal similarity search, which conducts nearest neighbor
search across different modalities, has been attracting increasing interest.
This paper presents a deep compact code learning solution for efficient
cross-modal similarity search. Many recent studies have proven that
quantization-based approaches perform generally better than hashing-based
approaches on single-modal similarity search. In this paper, we propose a deep quantization approach, which is among the early attempts to leverage deep neural networks for quantization-based cross-modal similarity search. Our
approach, dubbed shared predictive deep quantization (SPDQ), explicitly
formulates a shared subspace across different modalities and two private
subspaces for individual modalities, and representations in the shared subspace
and the private subspaces are learned simultaneously by embedding them to a
reproducing kernel Hilbert space, where the mean embedding of different
modality distributions can be explicitly compared. In addition, in the shared
subspace, a quantizer is learned to produce the semantics preserving compact
codes with the help of label alignment. Thanks to this novel network
architecture in cooperation with supervised quantization training, SPDQ can
preserve intramodal and intermodal similarities as much as possible and greatly
reduce quantization error. Experiments on two popular benchmarks corroborate that our approach outperforms state-of-the-art methods.
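The compact-code step at the heart of any quantization approach maps a real-valued embedding to the index of its nearest codeword, so search compares short codes instead of full vectors. A minimal sketch of that step (a generic vector-quantization assignment, not SPDQ's learned quantizer; the codebook values are toy data):

```python
def quantize(vec, codebook):
    """Assign an embedding to its nearest codeword by squared
    Euclidean distance, producing a compact integer code."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))

codebook = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
print(quantize([0.9, 0.1], codebook))  # index of the nearest codeword
```

SPDQ's contribution is learning the shared subspace, the codebook, and the deep encoders jointly so that these integer codes preserve both intramodal and intermodal similarity.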