3,664 research outputs found
Bayesian neural network learning for repeat purchase modelling in direct marketing.
We focus on purchase incidence modelling for a European direct mail company. Response models based on statistical and neural network techniques are contrasted. The evidence framework of MacKay is used as an example implementation of Bayesian neural network learning, a method that is fairly robust with respect to problems typically encountered when implementing neural networks. The automatic relevance determination (ARD) method, an integrated feature of this framework, makes it possible to assess the relative importance of the inputs. The basic response models use operationalisations of the traditionally discussed Recency, Frequency and Monetary (RFM) predictor categories. In a second experiment, the RFM response framework is enriched by the inclusion of other (non-RFM) customer profiling predictors. We contribute to the literature by providing experimental evidence that: (1) Bayesian neural networks offer a viable alternative for purchase incidence modelling; (2) a combined use of all three RFM predictor categories is advocated by the ARD method; (3) the inclusion of non-RFM variables significantly increases the predictive power of the constructed RFM classifiers; (4) this rise is mainly attributed to the inclusion of customer/company interaction variables and a variable measuring whether a customer uses the credit facilities of the direct mailing company.
Keywords: Marketing; Companies; Models; Model; Problems; Neural networks; Networks; Variables; Credit
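To make the RFM predictor categories mentioned in the abstract above concrete, the following is a minimal sketch of one common way to derive them from a customer's purchase history. The abstract does not give the paper's actual operationalisations, so the function name `rfm_features`, the field names and the definitions (days since last purchase, purchase count, total spend) are assumptions for illustration only.

```python
from datetime import date


def rfm_features(purchases, reference_date):
    """Compute simple RFM features for one customer.

    purchases: list of (purchase_date, amount) tuples (hypothetical format).
    reference_date: the date relative to which recency is measured.
    Returns None when the customer has no purchases.
    """
    if not purchases:
        return None
    last_purchase = max(d for d, _ in purchases)
    return {
        # Recency: days elapsed since the most recent purchase.
        "recency_days": (reference_date - last_purchase).days,
        # Frequency: number of purchases on record.
        "frequency": len(purchases),
        # Monetary: total amount spent.
        "monetary": sum(amount for _, amount in purchases),
    }


history = [(date(2020, 1, 10), 30.0), (date(2020, 3, 1), 20.0)]
features = rfm_features(history, date(2020, 3, 31))
```

Features of this kind could then be fed to any response model, statistical or neural; the non-RFM predictors described in the abstract would be added as further inputs.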
Forecasting foreign exchange rates with adaptive neural networks using radial basis functions and particle swarm optimization
The motivation for this paper is to introduce a hybrid Neural Network architecture of Particle
Swarm Optimization and Adaptive Radial Basis Function (ARBF-PSO), a time varying leverage
trading strategy based on Glosten, Jagannathan and Runkle (GJR) volatility forecasts and a
Neural Network fitness function for financial forecasting purposes. This is done by
benchmarking the ARBF-PSO results with those of three different Neural Network
architectures, a Nearest Neighbors algorithm (k-NN), an autoregressive moving average model
(ARMA), a moving average convergence/divergence model (MACD) and a naïve strategy.
More specifically, the trading and statistical performance of all models is investigated in a
forecast simulation of the EUR/USD, EUR/GBP and EUR/JPY ECB exchange rate fixing time
series over the period January 1999 to March 2011 using the last two years for out-of-sample
testing.
Improving malware detection with neuroevolution : a study with the semantic learning machine
Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence.
Machine learning has become more attractive over the years due to its remarkable adaptation and
problem-solving abilities. Algorithms compete with each other to claim the best possible results
for every problem, with generalization ability being one of their most valued characteristics.
A recently proposed methodology of Genetic Programming (GP), called Geometric Semantic Genetic
Programming (GSGP), has seen its popularity rise over the last few years, achieving great results
compared to other state-of-the-art algorithms, due to its remarkable feature of inducing a fitness
landscape with no local optima. For any supervised learning problem where a metric is used
as an error function, GSGP’s landscape is unimodal, allowing genetic algorithms to
behave much more efficiently and effectively.
Inspired by GSGP’s features, Gonçalves developed a new mutation operator to be applied to the Neural
Networks (NN) domain, creating the Semantic Learning Machine (SLM). Despite the good results
GSGP has already shown, further research is still needed to
empirically establish GSGP as a state-of-the-art framework.
In this case, the study focused on applying SLM to NNs with multiple hidden layers and comparing its
outputs to those of a very popular algorithm, the Multilayer Perceptron (MLP), on a considerably large classification
dataset about Android malware. The findings show that SLM, sharing a common parametrization with
MLP to ensure a fair comparison, outperforms it with statistical significance.
Curriculum Dropout
Dropout is a very effective way of regularizing neural networks.
Stochastically "dropping out" units with a certain probability discourages
over-specific co-adaptations of feature detectors, preventing overfitting and
improving network generalization. Besides, Dropout can be interpreted as an
approximate model aggregation technique, where an exponential number of smaller
networks are averaged in order to get a more powerful ensemble. In this paper,
we show that using a fixed dropout probability during training is a suboptimal
choice. We thus propose a time scheduling for the probability of retaining
neurons in the network. This induces an adaptive regularization scheme that
smoothly increases the difficulty of the optimization problem. This idea of
"starting easy" and adaptively increasing the difficulty of the learning
problem has its roots in curriculum learning and allows one to train better
models. Indeed, we prove that our optimization strategy implements a very
general curriculum scheme, by gradually adding noise to both the input and
intermediate feature representations within the network architecture.
Experiments on seven image classification datasets and different network
architectures show that our method, named Curriculum Dropout, frequently yields
better generalization and, at worst, performs just as well as the standard
Dropout method.
Comment: Accepted at ICCV (International Conference on Computer Vision) 201
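The time schedule for the retain probability described in the abstract above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the exponential form of the schedule and the names `retain_probability`, `target`, `gamma` and `dropout` are assumptions introduced here. The abstract only states that the probability of retaining neurons follows a schedule that gradually adds noise.

```python
import math
import random


def retain_probability(step, target=0.5, gamma=1e-3):
    """Retain probability at a given training step.

    Starts at 1.0 (no units dropped, the "easy" problem) and decays
    smoothly towards `target`. The exponential form is an assumption.
    """
    return (1.0 - target) * math.exp(-gamma * step) + target


def dropout(values, keep_prob, rng):
    """Inverted dropout: keep each unit with probability `keep_prob`
    and rescale the survivors so the expected activation is unchanged."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [0.2, -1.3, 0.7, 0.05]
for step in (0, 1000, 10000):
    p = retain_probability(step)
    noisy = dropout(activations, p, rng)
    print(step, round(p, 3), noisy)
```

At step 0 the retain probability is 1.0, so the activations pass through unchanged; as training proceeds, progressively more units are zeroed, which matches the "starting easy" idea in the abstract.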
Medical imaging analysis with artificial neural networks
Given that neural networks have been widely reported in the research community of medical imaging, we provide a focused literature survey on recent neural network developments in computer-aided diagnosis, medical image segmentation and edge detection towards visual content analysis, and medical image registration for its pre-processing and post-processing, with the aims of increasing awareness of how neural networks can be applied to these areas and providing a foundation for further research and practical development. Representative techniques and algorithms are explained in detail to provide inspiring examples illustrating: (i) how a known neural network with fixed structure and training procedure could be applied to resolve a medical imaging problem; (ii) how medical images could be analysed, processed, and characterised by neural networks; and (iii) how neural networks could be expanded further to resolve problems relevant to medical imaging. In the concluding section, a highlight of comparisons among many neural network applications is included to provide a global view on computational intelligence with neural networks in medical imaging.
Mixed Order Hyper-Networks for Function Approximation and Optimisation
Many systems take inputs, which can be measured and sometimes controlled, and outputs, which can also be measured and which depend on the inputs. Taking numerous measurements from such systems produces data, which may be used either to model the system with the goal of predicting the output associated with a given input (function approximation, or regression) or to find the input settings required to produce a desired output (optimisation, or search). Approximating or optimising a function is central to the field of computational intelligence.
There are many existing methods for performing regression and optimisation based on samples of data but they all have limitations. Multi-layer perceptrons (MLPs) are universal approximators, but they suffer from the black box problem, which means their structure and the function they implement are opaque to the user. They also suffer from a propensity to become trapped in local minima or large plateaux in the error function during learning. A regression method with a structure that allows models to be compared, human knowledge to be extracted, optimisation searches to be guided and model complexity to be controlled is desirable. This thesis presents such a method.
This thesis presents a single framework for both regression and optimisation: the mixed order hyper network (MOHN). A MOHN implements a function f:{-1,1}^n ->R to arbitrary precision. The structure of a MOHN makes the ways in which input variables interact to determine the function output explicit, which allows human insights and complexity control that are very difficult in neural networks with hidden units. The explicit structure representation also allows efficient algorithms for searching for an input pattern that leads to a desired output. A number of learning rules for estimating the weights based on a sample of data are presented along with a heuristic method for choosing which connections to include in a model. Several methods for searching a MOHN for inputs that lead to a desired output are compared.
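As a rough illustration of a function f:{-1,1}^n -> R with an explicit interaction structure, the sketch below assumes the model is a weighted sum of products of input variables, with one weight per chosen subset of inputs. That functional form, the dictionary representation and the names `mohn_output` and `weights` are assumptions made for this example; the abstract does not spell them out. The exhaustive search at the end is a plain brute-force stand-in, not one of the search methods compared in the thesis.

```python
from itertools import product
from math import prod


def mohn_output(weights, x):
    """Evaluate a weighted sum of products of inputs.

    weights: dict mapping a tuple of input indices to a weight.
             The empty tuple acts as a bias (an empty product is 1).
    x: sequence of inputs, each either -1 or 1.
    """
    return sum(w * prod(x[i] for i in idx) for idx, w in weights.items())


# Hypothetical model over two inputs: a bias, a first-order term on x0,
# and a second-order interaction between x0 and x1.
weights = {(): 0.5, (0,): 1.0, (0, 1): -2.0}

value = mohn_output(weights, (1, -1))

# Brute-force search for the input pattern with the highest output.
# Feasible only for small n, since it enumerates all 2**n patterns.
best_input = max(product((-1, 1), repeat=2), key=lambda x: mohn_output(weights, x))
```

Because each weight is tied to a named subset of inputs, the size and sign of a weight can be read directly as the strength and direction of that interaction, which is the kind of interpretability the abstract contrasts with hidden-unit networks.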
Experiments compare a MOHN to an MLP on regression tasks. The MOHN is found to achieve a comparable level of accuracy to an MLP but suffers less from local minima in the error function and shows less variance across multiple training trials. It is also easier to interpret and to combine into an ensemble. The trade-off between the fit of a model to its training data and that to an independent set of test data is shown to be easier to control in a MOHN than in an MLP.
A MOHN is also compared to a number of existing optimisation methods including those using estimation of distribution algorithms, genetic algorithms and simulated annealing. The MOHN is able to find optimal solutions in far fewer function evaluations than these methods on tasks selected from the literature.
Sparse tree-based initialization for neural networks
Dedicated neural network (NN) architectures have been designed to handle
specific data types (such as CNN for images or RNN for text), which ranks them
among state-of-the-art methods for dealing with these data. Unfortunately, no
architecture has been found for dealing with tabular data yet, for which tree
ensemble methods (tree boosting, random forests) usually show the best
predictive performances. In this work, we propose a new sparse initialization
technique for (potentially deep) multilayer perceptrons (MLP): we first train a
tree-based procedure to detect feature interactions and use the resulting
information to initialize the network, which is subsequently trained via
standard stochastic gradient strategies. Numerical experiments on several
tabular data sets show that this new, simple and easy-to-use method is a solid
competitor, in terms of both generalization capacity and computation time, to
default MLP initialization and even to existing complex deep learning
solutions. In fact, this careful MLP initialization raises the resulting NN
methods to the level of a valid competitor to gradient boosting when dealing
with tabular data. Moreover, such initializations are able to preserve the
sparsity of weights introduced in the first layers of the network through
training. This fact suggests that this new initializer performs an implicit
regularization during NN training, and emphasizes that the first layers act
as a sparse feature extractor (as convolutional layers do in CNNs).
Representation of Functional Data in Neural Networks
Functional Data Analysis (FDA) is an extension of traditional data analysis
to functional data, for example spectra, temporal series, spatio-temporal
images, gesture recognition data, etc. Functional data are rarely known in
practice; usually a regular or irregular sampling is known. For this reason,
some processing is needed in order to benefit from the smooth character of
functional data in the analysis methods. This paper shows how to extend the
Radial-Basis Function Networks (RBFN) and Multi-Layer Perceptron (MLP) models
to functional data inputs, in particular when the latter are known through
lists of input-output pairs. Various possibilities for functional processing
are discussed, including the projection on smooth bases, Functional Principal
Component Analysis, functional centering and reduction, and the use of
differential operators. It is shown how to incorporate these functional
processing steps into the RBFN and MLP models. The functional approach is illustrated
on a benchmark of spectrometric data analysis.
Comment: Also available online from:
http://www.sciencedirect.com/science/journal/0925231