Conditional probability generation methods for high reliability effects-based decision making
Decision making is often based on Bayesian networks. The building blocks of a
Bayesian network are its conditional probability tables (CPTs). These tables
are obtained by parameter estimation methods, or they are elicited from subject
matter experts (SMEs). Some of these knowledge representations are insufficient
approximations. Fusing knowledge from cause and effect observations leads to
better predictive decisions. We propose three new methods to generate CPTs,
which even work when only soft evidence is provided. The first two are novel
ways of mapping conditional expectations to the probability space. The third is
a column extraction method, which obtains CPTs from nonlinear functions such as
the multinomial logistic regression. Case studies on military effects and burnt
forest desertification have demonstrated that CPTs derived in this way have
highly reliable predictive power, including superiority over the CPTs obtained from
SMEs. In this context, new quality measures for determining the goodness of a
CPT and for comparing CPTs with each other have been introduced. The predictive
power and enhanced reliability of decision making based on the novel CPT
generation methods presented in this paper have been confirmed and validated
within the context of the case studies.
Comment: 18 pages, 3 figures
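The column-extraction idea can be sketched as follows: evaluate a softmax (multinomial logistic) model at each discrete parent configuration and take the resulting distribution as one CPT column. This is a minimal illustration, not the authors' implementation; the one-hot parent encoding and the toy weights are assumptions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

def cpt_from_mlr(W, b, parent_configs):
    """Column extraction: evaluate a multinomial-logistic (softmax) model at
    each discrete parent configuration; each evaluation yields one CPT column
    P(child | parents = config)."""
    cols = [softmax(W @ x + b) for x in parent_configs]
    return np.column_stack(cols)  # shape: (n_child_states, n_parent_configs)

# toy CPT: binary child, binary parent (parent states one-hot encoded)
W = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
b = np.zeros(2)
cpt = cpt_from_mlr(W, b, [np.array([1.0, 0.0]), np.array([0.0, 1.0])])
```

By construction every column is a proper conditional distribution (non-negative, sums to one), which is what makes the extracted table a valid CPT.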
Semi-supervised logistic discrimination for functional data
Multi-class classification methods based on both labeled and unlabeled
functional data sets are discussed. We present a semi-supervised logistic model
for classification in the context of functional data analysis. Unknown
parameters in our proposed model are estimated by regularization with the help
of the EM algorithm. A crucial point in the modeling procedure is the choice of a
regularization parameter involved in the semi-supervised functional logistic
model. To select this tuning parameter, we introduce model selection
criteria from information-theoretic and Bayesian viewpoints. Monte Carlo
simulations and a real data analysis are given to examine the effectiveness of
our proposed modeling strategy.
Comment: 21 pages, 7 figures
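The EM-style estimation can be sketched as follows. This is a simplified illustration under stated assumptions: binary classes, plain multivariate inputs rather than functional data, and a fixed L2 penalty `lam` standing in for the paper's regularization parameter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_weighted_logistic(X, y, w, lam, n_steps=200, lr=0.1):
    """Weighted, L2-regularized logistic regression fit by gradient descent."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_steps):
        p = sigmoid(X @ beta)
        grad = X.T @ (w * (p - y)) / len(y) + lam * beta
        beta -= lr * grad
    return beta

def em_semi_supervised(X_lab, y_lab, X_unl, lam=0.01, n_em=10):
    """EM for semi-supervised logistic regression: unlabeled points enter the
    fit twice (once per class), weighted by their posterior class probabilities."""
    n_l, n_u = len(y_lab), len(X_unl)
    X_all = np.vstack([X_lab, X_unl, X_unl])
    y_all = np.concatenate([y_lab, np.zeros(n_u), np.ones(n_u)])
    beta = fit_weighted_logistic(X_lab, y_lab, np.ones(n_l), lam)
    for _ in range(n_em):
        p1 = sigmoid(X_unl @ beta)                    # E-step: P(y=1 | x)
        w = np.concatenate([np.ones(n_l), 1 - p1, p1])
        beta = fit_weighted_logistic(X_all, y_all, w, lam)  # M-step
    return beta
```

In practice the regularization parameter `lam` would be chosen by the model selection criteria the abstract describes; here it is simply fixed.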
ADMM-SOFTMAX: An ADMM Approach for Multinomial Logistic Regression
We present ADMM-Softmax, an alternating direction method of multipliers
(ADMM) for solving multinomial logistic regression (MLR) problems. Our method
is geared toward supervised classification tasks with many examples and
features. It decouples the nonlinear optimization problem in MLR into three
steps that can be solved efficiently. In particular, each iteration of
ADMM-Softmax consists of a linear least-squares problem, a set of independent
small-scale smooth, convex problems, and a trivial dual variable update.
Solution of the least-squares problem can be accelerated by pre-computing a
factorization or preconditioner, and the separable smooth, convex problems can
easily be solved in parallel across examples. For two image
classification problems, we demonstrate that ADMM-Softmax leads to improved
generalization compared to Newton-Krylov, quasi-Newton, and stochastic
gradient descent methods.
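The three-step splitting can be sketched as follows. This is a simplified reconstruction, not the authors' code; the penalty `rho`, the inexact gradient-based Z-update, and the toy data in the test are assumptions.

```python
import numpy as np

def softmax_rows(Z):
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def admm_softmax(X, Y, n_iter=60, rho=1.0, prox_steps=30, prox_lr=0.1):
    """ADMM splitting for multinomial logistic regression (sketch).
    Introduce Z = X @ W so each iteration decouples into:
      1. W-update: a linear least-squares problem (factorization precomputed),
      2. Z-update: independent per-example smooth convex prox problems,
      3. a trivial dual variable update.
    Y is one-hot with shape (n, c)."""
    n, d = X.shape
    W = np.zeros((d, Y.shape[1]))
    Z = np.zeros((n, Y.shape[1]))
    U = np.zeros_like(Z)                       # scaled dual variable
    # pre-compute a Cholesky factorization, reused every iteration
    G = np.linalg.cholesky(X.T @ X + 1e-8 * np.eye(d))
    for _ in range(n_iter):
        # 1) W-update: least squares  min_W ||X W - (Z - U)||_F^2
        rhs = X.T @ (Z - U)
        W = np.linalg.solve(G.T, np.linalg.solve(G, rhs))
        V = X @ W + U
        # 2) Z-update: per-row prox of the cross-entropy loss,
        #    solved inexactly with a few gradient steps
        for _ in range(prox_steps):
            Z -= prox_lr * ((softmax_rows(Z) - Y) + rho * (Z - V))
        # 3) dual update
        U += X @ W - Z
    return W
```

The Z-update gradient is applied row-wise, so in a larger implementation those prox problems could be distributed across examples exactly as the abstract describes.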
Neural Networks for Target Selection in Direct Marketing
Partly due to a growing interest in direct marketing, it has become an important application field for data mining. Many techniques have been applied to select targets in commercial applications, such as statistical regression, regression trees, neural computing, fuzzy clustering, and association rules. Modeling of charity donations has also recently been considered. The availability of a large number of techniques for analyzing the data may at first look overwhelming and ultimately unnecessary. However, the amount of data used in direct marketing is tremendous. Further, there are different types of data and likely strong nonlinear relations among different groups within the data. Therefore, it is unlikely that a single method can be used under all circumstances. For that reason, it is important to have access to a range of different target selection methods that can be used in a complementary fashion. In this respect, learning systems such as neural networks have the advantage that they can adapt to the nonlinearity in the data and capture complex relations. This is an important motivation for applying neural networks to target selection. In this report, neural networks are applied to target selection in modeling of charity donations. The various stages of model building are described using data from a large Dutch charity organization as a case study. The results are compared with those of more traditional target selection methods such as logistic regression and CHAID.
Keywords: neural networks; data mining; direct mail; direct marketing; target selection
A Selective Overview of Deep Learning
Deep learning has arguably achieved tremendous success in recent years. In
simple words, deep learning uses the composition of many nonlinear functions to
model the complex dependency between input features and labels. While neural
networks have a long history, recent advances have greatly improved their
performance in computer vision, natural language processing, etc. From the
statistical and scientific perspective, it is natural to ask: What is deep
learning? What are the new characteristics of deep learning, compared with
classical methods? What are the theoretical foundations of deep learning? To
answer these questions, we introduce common neural network models (e.g.,
convolutional neural nets, recurrent neural nets, generative adversarial nets)
and training techniques (e.g., stochastic gradient descent, dropout, batch
normalization) from a statistical point of view. Along the way, we highlight
new characteristics of deep learning (including depth and over-parametrization)
and explain their practical and theoretical benefits. We also sample recent
results on theories of deep learning, many of which are only suggestive. While
a complete understanding of deep learning remains elusive, we hope that our
perspectives and discussions serve as a stimulus for new statistical research.
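As an example of one of the training techniques named above, inverted dropout, as it is commonly implemented, can be sketched in a few lines:

```python
import numpy as np

def dropout(a, p=0.5, rng=None, train=True):
    """Inverted dropout: during training, zero each activation with
    probability p and rescale the survivors by 1/(1-p) so the expected
    activation is unchanged; at test time it is the identity."""
    if not train:
        return a
    rng = rng or np.random.default_rng()
    mask = rng.random(a.shape) >= p   # keep with probability 1 - p
    return a * mask / (1.0 - p)
```

From the statistical viewpoint the survey takes, this rescaling is what keeps the layer's output unbiased, so dropout can be read as injecting multiplicative noise rather than changing the model.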
Leveraging Product as an Activation Function in Deep Networks
Product unit neural networks (PUNNs) are powerful representational models
with a strong theoretical basis, but have proven to be difficult to train with
gradient-based optimizers. We present windowed product unit neural networks
(WPUNNs), a simple method of leveraging product as a nonlinearity in a neural
network. Windowing the product tames the complex gradient surface and enables
WPUNNs to learn effectively, solving the problems faced by PUNNs. WPUNNs use
product layers between traditional sum layers, capturing the representational
power of product units and using the product itself as a nonlinearity. We find
that this method performs as well as traditional nonlinearities such as ReLU on
the MNIST dataset. We demonstrate that WPUNNs can also generalize gated
units in recurrent neural networks, yielding results comparable to LSTM
networks.
Comment: 6 pages, 3 figures, IEEE SMC 201
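A product layer of this kind might look as follows. The concrete windowing function is the paper's; squashing each factor with tanh before taking disjoint-window products is our assumption, meant only to illustrate how bounding the factors tames the product's otherwise complex gradient surface.

```python
import numpy as np

def windowed_product(x, window=2):
    """Sketch of a windowed product unit: bound each input, then take the
    product over disjoint windows, giving one product unit per window.
    NOTE: the tanh squashing is an assumed stand-in for the paper's window."""
    x = np.tanh(x)                 # bound each factor to (-1, 1)
    x = x.reshape(-1, window)      # group the inputs into disjoint windows
    return x.prod(axis=1)          # the product itself is the nonlinearity
```

In a WPUNN-style network this layer would sit between traditional sum (dense) layers, alternating sums and products.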
Machine learning based hyperspectral image analysis: A survey
Hyperspectral sensors enable the study of the chemical properties of scene
materials remotely for the purpose of identification, detection, and chemical
composition analysis of objects in the environment. Hence, hyperspectral images
captured from earth observing satellites and aircraft have been increasingly
important in agriculture, environmental monitoring, urban planning, mining, and
defense. Owing to their outstanding predictive power, machine learning
algorithms have become a key tool for modern hyperspectral image analysis.
Therefore, a solid understanding of machine learning techniques has become
essential for remote sensing researchers and practitioners. This paper reviews
and compares recent machine learning-based hyperspectral image analysis methods
published in the literature. We organize the methods by the image analysis task and by the type
of machine learning algorithm, and present a two-way mapping between the image
analysis tasks and the types of machine learning algorithms that can be applied
to them. The paper is comprehensive in coverage of both hyperspectral image
analysis tasks and machine learning algorithms. The image analysis tasks
considered are land cover classification, target detection, unmixing, and
physical parameter estimation. The machine learning algorithms covered are
Gaussian models, linear regression, logistic regression, support vector
machines, Gaussian mixture models, latent linear models, sparse linear models,
ensemble learning, directed graphical models, undirected graphical models,
clustering, Gaussian processes, Dirichlet
processes, and deep learning. We also discuss the open challenges in the field
of hyperspectral image analysis and explore possible future directions.
EndNet: Sparse AutoEncoder Network for Endmember Extraction and Hyperspectral Unmixing
Data acquired from multi-channel sensors is a highly valuable asset for
interpreting the environment in a variety of remote sensing applications.
However, low spatial resolution is a critical limitation of previous sensors,
and the constituent materials of a scene can be mixed in different fractions
due to their spatial interactions. Spectral unmixing is a technique that allows
us to obtain the material spectral signatures and their fractions from
hyperspectral data. In this paper, we propose a novel endmember extraction and
hyperspectral unmixing scheme, called \textit{EndNet}, that is based on a
two-staged autoencoder network. This well-known structure is enhanced and
restructured by introducing additional layers and a projection
metric (i.e., spectral angle distance (SAD) instead of inner product) to
achieve an optimum solution. Moreover, we present a novel loss function that is
composed of a Kullback-Leibler divergence term with SAD similarity and
additional penalty terms to improve the sparsity of the estimates. These
modifications enable us to impose common properties of endmembers, such as
non-linearity and sparsity, on autoencoder networks. Lastly, due to the
stochastic-gradient based approach, the method is scalable for large-scale data
and it can be accelerated on Graphical Processing Units (GPUs). To demonstrate
the superiority of our proposed method, we conduct extensive experiments on
several well-known datasets. The results confirm that the proposed method
considerably improves the performance compared to the state-of-the-art
techniques in the literature.
Comment: To appear in IEEE Transactions on Geoscience and Remote Sensing
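The spectral angle distance used above as the projection metric has a standard form: the angle between two spectra, which is invariant to per-pixel scaling. A minimal sketch:

```python
import numpy as np

def spectral_angle_distance(a, b):
    """Spectral angle distance (SAD) between two spectra: the angle between
    them, insensitive to overall scaling (e.g., illumination differences)."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards rounding error
```

This scale invariance is exactly why SAD is preferred over a plain inner product when comparing pixel spectra against endmember signatures.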
A Deep Learning and Gamification Approach to Energy Conservation at Nanyang Technological University
The implementation of smart building technology in the form of smart
infrastructure applications has great potential to improve sustainability and
energy efficiency by leveraging a humans-in-the-loop strategy. However, human
preference in regard to living conditions is usually unknown and heterogeneous
in its manifestation as control inputs to a building. Furthermore, the
occupants of a building typically lack the independent motivation necessary to
contribute to and play a key role in the control of smart building
infrastructure. Moreover, true human actions and their integration with
sensing/actuation platforms remain unknown to the decision maker tasked with
improving operational efficiency. By modeling user interaction as a sequential
discrete game between non-cooperative players, we introduce a gamification
approach for supporting user engagement and integration in a human-centric
cyber-physical system. We propose the design and implementation of a
large-scale network game with the goal of improving the energy efficiency of a
building through the utilization of cutting-edge Internet of Things (IoT)
sensors and cyber-physical systems sensing/actuation platforms. A benchmark
utility learning framework that employs robust estimation for classical
discrete choice models is provided for the derived high-dimensional, imbalanced
data. To improve forecasting performance, we extend the benchmark utility
learning scheme by leveraging Deep Learning end-to-end training with Deep
bi-directional Recurrent Neural Networks. We apply the proposed methods to high
dimensional data from a social game experiment designed to encourage energy
efficient behavior among smart building occupants in Nanyang Technological
University (NTU) residential housing. Using occupant-retrieved actions for
resources such as lighting and A/C, we simulate the game defined by the
estimated utility functions.
Comment: 16 double pages; shorter version submitted to the Applied Energy Journal
EEG machine learning with Higuchi fractal dimension and Sample Entropy as features for successful detection of depression
Reliable diagnosis of depressive disorder is essential for both optimal
treatment and prevention of fatal outcomes. In this study, we aimed to
elucidate the effectiveness of two non-linear measures, Higuchi Fractal
Dimension (HFD) and Sample Entropy (SampEn), in detecting depressive disorders
when applied on EEG. HFD and SampEn of EEG signals were used as features for
seven machine learning algorithms including Multilayer Perceptron, Logistic
Regression, Support Vector Machines with the linear and polynomial kernel,
Decision Tree, Random Forest, and Naive Bayes classifier, to discriminate the
EEG of healthy control subjects from that of patients diagnosed with depression. We
confirmed earlier observations that both non-linear measures can discriminate
EEG signals of patients from healthy control subjects. The results suggest that
good classification is possible even with a small number of principal
components. Average accuracy among classifiers ranged from 90.24% to 97.56%.
Among the two measures, SampEn had better performance. Using HFD and SampEn
with a variety of machine learning techniques, we can accurately discriminate
patients diagnosed with depression from controls, which can serve as a highly
sensitive, clinically relevant marker for the diagnosis of depressive
disorders.
Comment: 34 pages, 4 figures, 2 tables
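Sample Entropy, one of the two features above, has a standard definition that can be sketched as follows. This is a straightforward reference implementation, not the authors' code; the default tolerance r = 0.2·std is a common convention.

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """Sample Entropy SampEn(m, r) = -log(A / B), where B counts pairs of
    length-m templates within Chebyshev tolerance r, A counts the same for
    length m + 1, and self-matches are excluded."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()          # common convention: 20% of the signal's SD

    def count_matches(mm):
        templates = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        total = 0
        for i in range(len(templates)):
            d = np.max(np.abs(templates - templates[i]), axis=1)
            total += np.sum(d <= r) - 1   # subtract the self-match
        return total

    B = count_matches(m)
    A = count_matches(m + 1)
    return -np.log(A / B)
```

A regular signal repeats its templates, so A/B stays close to one and SampEn is low; an irregular signal loses matches as the template grows, so SampEn is high. That is the contrast the depression-detection features exploit.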