62,231 research outputs found
Advanced Stochastic Optimization Algorithm for Deep Learning Artificial Neural Networks in Banking and Finance Industries
One of the objectives of this paper is to incorporate fat-tail effects into, for instance, Sigmoid in order to introduce Transparency and Stability into the existing stochastic Activation Functions. Secondly, according to the available literature reviewed, the existing set of Activation Functions were introduced into the Deep learning Artificial Neural Network through the “Window” not properly through the “Legitimate Door” since they are “Trial and Error “and “Arbitrary Assumptions”, thus, the Author proposed a “Scientific Facts”, “Definite Rules: Jameel’s Stochastic ANNAF Criterion”, and a “Lemma” to substitute not necessarily replace the existing set of stochastic Activation Functions, for instance, the Sigmoid among others. This research is expected to open the “Black-Box” of Deep Learning Artificial Neural networks. The author proposed a new set of advanced optimized fat-tailed Stochastic Activation Functions EMANATED from the AI-ML-Purified Stocks Data namely; the Log – Logistic (3P) Probability Distribution (1st), Cauchy Probability Distribution (2nd), Pearson 5 (3P) Probability Distribution (3rd), Burr (4P) Probability Distribution (4th), Fatigue Life (3P) Probability Distribution (5th), Inv. Gaussian (3P) Probability Distribution (6th), Dagum (4P) Probability Distribution (7th), and Lognormal (3P) Probability Distribution (8th) for the successful conduct of both Forward and Backward Propagations of Deep Learning Artificial Neural Network. However, this paper did not check the Monotone Differentiability of the proposed distributions. Appendix A, B, and C presented and tested the performances of the stressed Sigmoid and the Optimized Activation Functions using Stocks Data (2014-1991) of Microsoft Corporation (MSFT), Exxon Mobil (XOM), Chevron Corporation (CVX), Honda Motor Corporation (HMC), General Electric (GE), and U.S. Fundamental Macroeconomic Parameters, the results were found fascinating. Thus, guarantee, the first three distributions are excellent Activation Functions to successfully conduct any Stock Deep Learning Artificial Neural Network. Distributions Number 4 to 8 are also good Advanced Optimized Activation Functions. Generally, this research revealed that the Advanced Optimized Activation Functions satisfied Jameel’s ANNAF Stochastic Criterion depends on the Referenced Purified AI Data Set, Time Change and Area of Application which is against the existing “Trial and Error “and “Arbitrary Assumptions” of Sigmoid, Tanh, Softmax, ReLu, and Leaky ReLu
Stochastic Reinforcement Learning
In reinforcement learning episodes, the rewards and punishments are often
non-deterministic, and there are invariably stochastic elements governing the
underlying situation. Such stochastic elements are often numerous and cannot be
known in advance, and they have a tendency to obscure the underlying rewards
and punishments patterns. Indeed, if stochastic elements were absent, the same
outcome would occur every time and the learning problems involved could be
greatly simplified. In addition, in most practical situations, the cost of an
observation to receive either a reward or punishment can be significant, and
one would wish to arrive at the correct learning conclusion by incurring
minimum cost. In this paper, we present a stochastic approach to reinforcement
learning which explicitly models the variability present in the learning
environment and the cost of observation. Criteria and rules for learning
success are quantitatively analyzed, and probabilities of exceeding the
observation cost bounds are also obtained.Comment: AIKE 201
Network Inference with Hidden Units
We derive learning rules for finding the connections between units in
stochastic dynamical networks from the recorded history of a ``visible'' subset
of the units. We consider two models. In both of them, the visible units are
binary and stochastic. In one model the ``hidden'' units are continuous-valued,
with sigmoidal activation functions, and in the other they are binary and
stochastic like the visible ones. We derive exact learning rules for both
cases. For the stochastic case, performing the exact calculation requires, in
general, repeated summations over an number of configurations that grows
exponentially with the size of the system and the data length, which is not
feasible for large systems. We derive a mean field theory, based on a
factorized ansatz for the distribution of hidden-unit states, which offers an
attractive alternative for large systems. We present the results of some
numerical calculations that illustrate key features of the two models and, for
the stochastic case, the exact and approximate calculations
MIHash: Online Hashing with Mutual Information
Learning-based hashing methods are widely used for nearest neighbor
retrieval, and recently, online hashing methods have demonstrated good
performance-complexity trade-offs by learning hash functions from streaming
data. In this paper, we first address a key challenge for online hashing: the
binary codes for indexed data must be recomputed to keep pace with updates to
the hash functions. We propose an efficient quality measure for hash functions,
based on an information-theoretic quantity, mutual information, and use it
successfully as a criterion to eliminate unnecessary hash table updates. Next,
we also show how to optimize the mutual information objective using stochastic
gradient descent. We thus develop a novel hashing method, MIHash, that can be
used in both online and batch settings. Experiments on image retrieval
benchmarks (including a 2.5M image dataset) confirm the effectiveness of our
formulation, both in reducing hash table recomputations and in learning
high-quality hash functions.Comment: International Conference on Computer Vision (ICCV), 201
Functional Bipartite Ranking: a Wavelet-Based Filtering Approach
It is the main goal of this article to address the bipartite ranking issue
from the perspective of functional data analysis (FDA). Given a training set of
independent realizations of a (possibly sampled) second-order random function
with a (locally) smooth autocorrelation structure and to which a binary label
is randomly assigned, the objective is to learn a scoring function s with
optimal ROC curve. Based on linear/nonlinear wavelet-based approximations, it
is shown how to select compact finite dimensional representations of the input
curves adaptively, in order to build accurate ranking rules, using recent
advances in the ranking problem for multivariate data with binary feedback.
Beyond theoretical considerations, the performance of the learning methods for
functional bipartite ranking proposed in this paper are illustrated by
numerical experiments
- …