Newton based Stochastic Optimization using q-Gaussian Smoothed Functional Algorithms
We present the first q-Gaussian smoothed functional (SF) estimator of the
Hessian and the first Newton-based stochastic optimization algorithm that
estimates both the Hessian and the gradient of the objective function using
q-Gaussian perturbations. Our algorithm requires only two system simulations
(regardless of the parameter dimension) and estimates both the gradient and the
Hessian at each update epoch using these. We also present a proof of
convergence of the proposed algorithm. In a related recent work (Ghoshdastidar
et al., 2013), we presented gradient SF algorithms based on the q-Gaussian
perturbations. Our work extends prior work on smoothed functional algorithms by
generalizing the class of perturbation distributions, since most distributions
reported in the literature for which SF algorithms are known to work turn
out to be special cases of the q-Gaussian distribution. Besides studying the
convergence properties of our algorithm analytically, we also show the results
of several numerical simulations on a model of a queuing network that
illustrate the significance of the proposed method. In particular, we observe
that our algorithm performs better in most cases, over a wide range of
q-values, in comparison to Newton SF algorithms with the Gaussian (Bhatnagar,
2007) and Cauchy perturbations, as well as the gradient q-Gaussian SF
algorithms (Ghoshdastidar et al., 2013). Comment: This is a longer version of the paper with the same title accepted
in Automatic
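The claim that familiar perturbation distributions are special cases of the q-Gaussian can be illustrated with a minimal sketch. It assumes the standard one-dimensional Tsallis form of the unnormalized kernel, e_q(-beta x^2), which may differ from the parametrization used in the paper; the function name `q_gaussian_kernel` and the parameter `beta` are hypothetical.

```python
import math

def q_gaussian_kernel(x, q, beta=1.0):
    """Unnormalized 1-D q-Gaussian kernel e_q(-beta * x**2).

    Assumption: standard Tsallis q-exponential form, not necessarily
    the exact parametrization used in the paper.
    """
    if abs(q - 1.0) < 1e-12:
        # The limit q -> 1 recovers the Gaussian kernel.
        return math.exp(-beta * x * x)
    base = 1.0 - (1.0 - q) * beta * x * x
    if base <= 0.0:
        # For q < 1 the kernel has compact support.
        return 0.0
    return base ** (1.0 / (1.0 - q))

# q = 2 gives the Cauchy shape 1 / (1 + beta * x**2).
cauchy_like = q_gaussian_kernel(1.5, q=2.0)
# q close to 1 is close to the Gaussian shape exp(-beta * x**2).
near_gaussian = q_gaussian_kernel(0.7, q=1.0001)
```

In this form, q = 2 yields a Cauchy-shaped kernel and q near 1 approaches the Gaussian kernel, while q < 1 gives a kernel that vanishes outside a bounded interval.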
Deterministic Annealing: A Variant of Simulated Annealing and its Application to Fuzzy Clustering
Deterministic annealing (DA) is a deterministic variant of simulated annealing. In this chapter, after briefly introducing DA, we explain how DA is combined with fuzzy c-means (FCM) clustering by employing the entropy maximization method, in particular Tsallis entropy maximization. The Tsallis entropy is a one-parameter extension of the Shannon entropy, indexed by q. We then focus on Tsallis-entropy-maximized FCM (Tsallis-DAFCM) and examine the effects of cooling functions for DA on accuracy and convergence. The shape of the membership function of Tsallis-DAFCM depends on both the system temperature and q. Accordingly, the relationship between the temperature and q is quantitatively investigated.
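As a small illustration of how the Tsallis entropy extends the Shannon entropy, the sketch below uses the standard definition S_q = (1 - sum_i p_i^q) / (q - 1) with the constant k set to 1 and natural logarithms. The function name `tsallis_entropy` is hypothetical and the code is not taken from the chapter.

```python
import math

def tsallis_entropy(p, q):
    """Tsallis entropy of a discrete distribution p (with k = 1).

    Assumption: p is a sequence of non-negative numbers summing to 1.
    """
    if abs(q - 1.0) < 1e-12:
        # The limit q -> 1 is the Shannon entropy in nats.
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)

uniform = [0.5, 0.5]
shannon = tsallis_entropy(uniform, 1.0)         # equals ln 2
near_shannon = tsallis_entropy(uniform, 1.0001) # close to ln 2
quadratic = tsallis_entropy(uniform, 2.0)       # equals 1 - sum p_i^2
```

For the uniform two-point distribution, the value at q = 2 is 1 - (0.25 + 0.25) = 0.5, and values for q near 1 approach ln 2.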
Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions
Smoothed functional (SF) schemes for gradient estimation are known to be
efficient in stochastic optimization algorithms, especially when the objective
is to improve the performance of a stochastic system. However, the performance
of these methods depends on several parameters, such as the choice of a
suitable smoothing kernel. Different kernels have been studied in the literature,
including Gaussian, Cauchy and uniform distributions among others. This
paper studies a new class of kernels based on the q-Gaussian distribution, which
has gained popularity in statistical physics over the last decade. Though the
importance of this family of distributions is attributed to its ability to
generalize the Gaussian distribution, we observe that this class encompasses
almost all existing smoothing kernels. This motivates us to study SF schemes
for gradient estimation using the q-Gaussian distribution. Using the derived
gradient estimates, we propose two-timescale algorithms for optimization of a
stochastic objective function in a constrained setting with projected gradient
search approach. We prove the convergence of our algorithms to the set of
stationary points of an associated ODE. We also demonstrate their performance
numerically through simulations on a queuing model.
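To make the idea of smoothed functional gradient estimation concrete, here is a minimal sketch of the classical two-sided estimator with Gaussian perturbations, one of the existing kernels the abstract mentions. It is not the q-Gaussian estimator or the two-timescale algorithm of the paper: it assumes a deterministic objective instead of a simulated system and simply averages many perturbations. The names `sf_gradient`, `beta`, and `num_samples` are hypothetical.

```python
import random

def sf_gradient(f, theta, beta=0.1, num_samples=20000, seed=0):
    """Two-sided Gaussian smoothed functional gradient estimate.

    Averages eta * (f(theta + beta*eta) - f(theta - beta*eta)) / (2*beta)
    over standard normal perturbation vectors eta.
    """
    rng = random.Random(seed)
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(num_samples):
        eta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Two function evaluations per sample, regardless of dimension.
        plus = f([t + beta * e for t, e in zip(theta, eta)])
        minus = f([t - beta * e for t, e in zip(theta, eta)])
        scale = (plus - minus) / (2.0 * beta)
        for i in range(dim):
            grad[i] += eta[i] * scale / num_samples
    return grad

# Example objective (an assumption for illustration): f(theta) = ||theta||^2,
# whose exact gradient at (1, -2) is (2, -4).
estimate = sf_gradient(lambda th: sum(t * t for t in th), [1.0, -2.0])
```

Each sample uses only two evaluations of the objective whatever the dimension of theta, which is the property that makes SF schemes attractive when every evaluation is a costly simulation.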
Facticity as the amount of self-descriptive information in a data set
Using the theory of Kolmogorov complexity the notion of facticity φ(x)
of a string is defined as the amount of self-descriptive information it
contains. It is proved that (under reasonable assumptions: the existence of an
empty machine and the availability of a faithful index) facticity is definite,
i.e. random strings have facticity 0 and for compressible strings 0 < φ(x)
< 1/2 |x| + O(1). Consequently facticity measures the tension in a data set
between structural and ad-hoc information objectively. For binary strings there
is a so-called facticity threshold that is dependent on their entropy. Strings
with facticity above this threshold have no optimal stochastic model and are
essentially computational. The shape of the facticity versus entropy plot
coincides with the well-known sawtooth curves observed in complex systems. The
notion of factic processes is discussed. This approach overcomes problems with
earlier proposals to use two-part code to define the meaningfulness or
usefulness of a data set. Comment: 10 pages, 2 figure
Entropies from coarse-graining: convex polytopes vs. ellipsoids
We examine the Boltzmann/Gibbs/Shannon entropy, the
non-additive Havrda-Charvát/Daróczy/Cressie-Read/Tsallis entropy,
and the Kaniadakis κ-entropy
from the viewpoint of coarse-graining, symplectic capacities and convexity. We
argue that the functional form of such entropies can be ascribed to a
discordance in phase-space coarse-graining between two generally different
approaches: the Euclidean/Riemannian metric one that reflects independence and
picks cubes as the fundamental cells and the symplectic/canonical one that
picks spheres/ellipsoids for this role. Our discussion is motivated by and
confined to the behaviour of Hamiltonian systems of many degrees of freedom. We
see that Dvoretzky's theorem provides asymptotic estimates for the minimal
dimension beyond which these two approaches are close to each other. We state
and speculate about the role that dualities may play in this viewpoint. Comment: 63 pages. No figures. Standard LaTe