452 research outputs found
Multiple-Instance Learning: Radon-Nikodym Approach to Distribution Regression Problem
For the distribution regression problem, where a bag of observations is mapped to a single value, a one-step solution is proposed. The problem of mapping a random distribution to a random value is transformed to mapping a random vector to a random value by taking the distribution moments of the observations in a bag as the random vector. Then Radon-Nikodym or least squares theory can be applied, which gives the estimator. The probability distribution of the answer is also obtained; this requires solving a generalized eigenvalues problem: the matrix spectrum (not depending on the input) gives the possible outcomes, and the input-dependent probabilities of the outcomes can be obtained by projecting the distribution with a fixed value (a delta function) onto the corresponding eigenvector. A library providing a numerically stable polynomial basis for these calculations is available, which makes the proposed approach practical.
Comment: Grammar fixes. Off-by-one error in eigenvalues problem fixed.
Radon-Nikodym approximation in application to image analysis
For an image, pixel information can be converted to the moments of some basis, e.g. Fourier-Mellin, Zernike, monomials, etc. Given a sufficient number of moments, pixel information can be completely recovered; for an insufficient number of moments only partial information can be recovered, and the image reconstruction is, at best, of interpolatory type. The standard approach is to present the interpolated value as a linear combination of basis functions, which is equivalent to a least squares expansion. However, recent progress in the numerical stability of moments estimation allows image information to be recovered from the moments in a completely different manner, applying a Radon-Nikodym type of expansion, which gives the result as a ratio of two quadratic forms of basis functions. In contrast with least squares, the Radon-Nikodym approach has oscillations near the boundaries very much suppressed and does not diverge outside of the basis support. While least squares theory operates with vectors of moments, the Radon-Nikodym theory operates with matrices of moments, which makes the approach much more suitable to image transforms and statistical property estimation.
Comment: Images interpolated with d_x=d_y=100 are added to show the practicality of high order moments calculation.
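The contrast between the two expansions can be sketched in one dimension: least squares yields a linear combination of basis functions, while the Radon-Nikodym result is a ratio of two quadratic forms built from the moment matrices (a minimal numpy sketch; the monomial basis, test function, and sizes are illustrative assumptions, not the paper's setup):

```python
import numpy as np

# Signal sampled on a grid (a 1-d stand-in for pixel data).
x = np.linspace(-1.0, 1.0, 400)
f = np.tanh(3.0 * x)

d = 6                                       # number of basis functions
Q = np.stack([x ** k for k in range(d)])    # monomial basis Q_k(x)

G = Q @ Q.T / x.size                        # Gram matrix  <Q_j Q_k>
M = (Q * f) @ Q.T / x.size                  # weighted moments <f Q_j Q_k>
Gi = np.linalg.inv(G)

# Least squares: a linear combination of the basis functions.
f_ls = (Gi @ (Q @ f / x.size)) @ Q

# Radon-Nikodym: a ratio of two quadratic forms of the basis.
num = np.einsum('jx,jk,kx->x', Q, Gi @ M @ Gi, Q)
den = np.einsum('jx,jk,kx->x', Q, Gi, Q)
f_rn = num / den
```

At every point the Radon-Nikodym value is a Rayleigh quotient of the pair (M, G), so it stays within [min f, max f] and cannot diverge outside the basis support, unlike a polynomial least squares fit.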
Evolution of magnetic field curvature in the Kulsrud-Anderson dynamo theory
We find that in the kinematic limit the ensemble averaged square of the
curvature of magnetic field lines is exponentially amplified in time by the
turbulent motions in a highly conductive plasma. At the same time, the ensemble
averaged curvature vector exponentially decays to zero. Thus, independently of
the initial conditions, the fluctuation field becomes very curved, and the
curvature vector becomes highly isotropic.
Keywords: ISM: magnetic fields, MHD, turbulence, methods: analytical
Comment: 4 pages (ApJ twocolumn style), a part of the conclusion has been changed.
Norm-Free Radon-Nikodym Approach to Machine Learning
For the Machine Learning (ML) classification problem, where a vector of observations (values of attributes) is mapped to a single value (class label), a generalized Radon-Nikodym type of solution is proposed. Quantum-mechanics-like probability states are considered, and "Cluster Centers", corresponding to the extremums of these probability states, are found from a generalized eigenvalues problem. The eigenvalues give the possible outcomes, and the eigenvectors corresponding to them define the "Cluster Centers". The projection of a state, localized at the given point to classify, on these eigenvectors defines the probability of each outcome, thus avoiding the use of a norm (L^2 or other types) required for the "quality criteria" in a typical Machine Learning technique. A coverage of each "Cluster Center" is calculated, which potentially allows one to separate system properties (described by the outcomes) and system testing conditions (described by the coverage). As an example of such an application, a probability distribution estimator is proposed in the form of value-probability pairs, which can be considered as a generalization of Gauss quadratures. This estimator allows one to perform probability distribution estimation in a strongly non-Gaussian case.
Comment: Cluster localization measure added. Quantum mechanics analogy improved and expanded (density matrix exact expression added). Coverage calculation via matrix spectrum added.
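The norm-free classification step can be sketched numerically: solve the generalized eigenproblem for the two moment matrices, then project a state localized at the point to classify onto the eigenvectors; the squared projections are the outcome probabilities (a minimal numpy sketch on a synthetic 1-d problem; the basis, labels, and normalization are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, 4000)   # attribute samples
y = np.sign(x)                     # class label (+1 / -1)

d = 5
Q = np.stack([x ** k for k in range(d)])   # polynomial basis
G = Q @ Q.T / x.size                       # <Q_j Q_k>
M = (Q * y) @ Q.T / x.size                 # <y Q_j Q_k>

# Generalized eigenproblem M v = lam G v, reduced to a standard one
# via Cholesky so the eigenvectors satisfy V.T @ G @ V = I.
L = np.linalg.cholesky(G)
Li = np.linalg.inv(L)
lam, U = np.linalg.eigh(Li @ M @ Li.T)
V = Li.T @ U          # eigenvalues lam: outcomes; columns of V: "Cluster Centers"

def outcome_probabilities(x0):
    """Squared projections of the state localized at x0 on the eigenvectors."""
    q = np.array([x0 ** k for k in range(d)])
    psi = np.linalg.solve(G, q)
    psi /= np.sqrt(psi @ G @ psi)          # normalize: psi^T G psi = 1
    return (V.T @ (G @ psi)) ** 2

p = outcome_probabilities(0.7)
```

No least squares norm appears anywhere: the probabilities come purely from projections, they sum to one by construction, and the probability-weighted outcomes give the class prediction.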
On Numerical Estimation of Joint Probability Distribution from Lebesgue Integral Quadratures
An important application of the Lebesgue integral quadrature [1] is developed. Given two random processes, f and g, two generalized eigenvalue problems can be formulated and solved. In addition to obtaining two Lebesgue quadratures (for f and g) from the two eigenproblems, the projections of the f- and g-eigenvectors on each other allow one to build a joint distribution estimator, the most general form of which is a density-matrix correlation. Examples of the density-matrix correlation are the value-correlation, similar to the regular correlation concept, and a new one, the probability-correlation. The theory is implemented numerically; the software is available under the GPLv3 license.
Comment: Grammar fixes. Density matrix relation added.
Market Dynamics. On Supply and Demand Concepts
The disbalance of Supply and Demand is typically considered as the driving force of the markets. However, the measurement or estimation of Supply and Demand at a price different from the execution price is not possible even after the transaction. An approach in which Supply and Demand are always matched, but the rate of their matching (the number of units traded per unit time) varies, is proposed. The state of the system is determined not by a price, but by a probability distribution defined as the square of a wavefunction. The equilibrium state is postulated to be the one giving the maximal matching rate, obtained by maximizing the matching rate functional, i.e. solving a dynamic equation of the form "the future price tends to the value maximizing the number of shares traded per unit time". An application of the theory in a quasi-stationary case is demonstrated. This transition from the Supply and Demand concept to the Liquidity Deficit concept, described by the matching rate, allows one to operate only with observable variables, and to have a theory applicable to practical problems.
Multiple--Instance Learning: Christoffel Function Approach to Distribution Regression Problem
A two-step Christoffel function based solution to the distribution regression problem is proposed. On the first step, to model the distribution of observations inside a bag, a Christoffel function is built for each bag of observations. Then, on the second step, an outcome variable Christoffel function is built, with the bag's Christoffel function value at a given point used as the weight for the bag's outcome. The approach allows the result to be obtained in closed form and then evaluated numerically. While most existing approaches minimize some kind of error between the outcome and the prediction, the proposed approach is conceptually different, because it uses the Christoffel function for knowledge representation, which is conceptually equivalent to working with probabilities only. To obtain the possible outcomes and their probabilities, a Gauss quadrature for the second-step measure can be built; the nodes then give the possible outcomes, and the normalized weights give the outcome probabilities. A library providing a numerically stable polynomial basis for these calculations is available, which makes the proposed approach practical.
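The first-step object can be sketched directly: for a bag of observations, build the moment matrix of a polynomial basis and evaluate the Christoffel function lambda(x) = 1 / (Q(x)^T G^{-1} Q(x)), which is large where the bag's observations are dense (a minimal numpy sketch; the Gaussian bag and monomial basis are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
obs = rng.normal(0.0, 1.0, 1000)             # one bag of observations

d = 5
Q = np.stack([obs ** k for k in range(d)])   # basis at the observations
G = Q @ Q.T / obs.size                       # moment (Gram) matrix <Q_j Q_k>
Gi = np.linalg.inv(G)

def christoffel(x):
    """Christoffel function: reciprocal quadratic form of the basis at x."""
    q = np.array([x ** k for k in range(d)])
    return 1.0 / (q @ Gi @ q)
```

Two sanity checks: lambda is larger in the bulk of the data than in the tail, and 1/lambda averaged over the bag equals the basis size d (a trace identity), so lambda behaves like a density surrogate suitable as a weight in the second step.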
The power of choice combined with preferential attachment
We prove almost sure convergence of the maximum degree in an evolving tree
model combining local choice and preferential attachment. At each step in the
growth of the graph, a new vertex is introduced. A fixed, finite number of
possible neighbors are sampled from the existing vertices with probability
proportional to degree. Of these possibilities, the vertex with the largest
degree is chosen. The maximal degree in this model has linear or near-linear
behavior. This contrasts sharply with what is seen in the same choice model
without preferential attachment. The proof is based on showing that the tree has a persistent hub, by comparison with the standard preferential attachment model, as well as on martingale and random walk arguments.
On Lebesgue Integral Quadrature
A new type of quadrature is developed. The Gaussian quadrature, for a given
measure, finds optimal values of a function's argument (nodes) and the
corresponding weights. In contrast, the Lebesgue quadrature developed in this
paper, finds optimal values of function (value-nodes) and the corresponding
weights. The Gaussian quadrature groups sums by function argument; it can be viewed as an n-point discrete measure, producing the Riemann integral. The Lebesgue quadrature groups sums by function value; it can be viewed as an n-point discrete distribution, producing the Lebesgue integral.
Mathematically, the problem is reduced to a generalized eigenvalue problem:
Lebesgue quadrature value-nodes are the eigenvalues and the corresponding
weights are the square of the averaged eigenvectors. A numerical estimation of
an integral as the Lebesgue integral is especially advantageous when analyzing
irregular and stochastic processes. The approach separates the outcome
(value-nodes) and the probability of the outcome (weight). For this reason, it
is especially well-suited for the study of non-Gaussian processes. The software
implementing the theory is available from the authors.
Comment: Relation to density matrix added. Images fixed. Density matrix appendix fixed. Christoffel function spectrum is added to Appendix B. Numerical examples of the Christoffel weights are added. The optimal clustering solution is added to Appendix C. Notation changes according to arXiv:1906.00460. Software new version; description updated.
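The construction reduces to linear algebra: build the two moment matrices, solve the generalized eigenproblem, and read off value-nodes and weights (a minimal numpy sketch on a synthetic process; the basis size and test function are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, 5000)           # sample of the measure
f = np.sin(np.pi * x)                      # the function to "integrate"

d = 6
Q = np.stack([x ** k for k in range(d)])   # polynomial basis, Q_0 = 1
G = Q @ Q.T / x.size                       # <Q_j Q_k>
M = (Q * f) @ Q.T / x.size                 # <f Q_j Q_k>

# Generalized eigenproblem M v = lam G v, reduced to a standard one
# via Cholesky so the eigenvectors satisfy V.T @ G @ V = I.
L = np.linalg.cholesky(G)
Li = np.linalg.inv(L)
lam, U = np.linalg.eigh(Li @ M @ Li.T)
V = Li.T @ U

# Value-nodes are the eigenvalues; the weights are the squares of the
# averaged eigenvectors.
w = (V.T @ Q.mean(axis=1)) ** 2
```

By construction the weights sum to the total measure, sum(lam * w) recovers the Lebesgue integral of f (its sample mean here), and every value-node lies within the range of f, which is what makes the quadrature a natural outcome/probability separation.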
The power of 2 choices over preferential attachment
We introduce a new type of preferential attachment tree that includes choices
in its evolution, like with Achlioptas processes. At each step in the growth of
the graph, a new vertex is introduced. Two possible neighbor vertices are
selected independently and with probability proportional to degree. Between the
two, the vertex with smaller degree is chosen, and a new edge is created. We
determine, with high probability, the largest degree of this graph up to some additive error term.
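The growth rule is easy to simulate: sample two candidates with probability proportional to degree and attach the new vertex to the smaller-degree one (a minimal Python sketch using the standard repeated-vertex pool trick; sizes and names are illustrative):

```python
import random

random.seed(7)

def choice_pa_tree(n, choices=2):
    """Grow a preferential attachment tree with choice: each new vertex
    samples `choices` candidates with probability proportional to degree
    and attaches to the candidate of smallest degree."""
    deg = [1, 1]      # start from the single edge 0--1
    pool = [0, 1]     # each vertex appears deg[v] times in the pool
    for v in range(2, n):
        cands = [random.choice(pool) for _ in range(choices)]
        u = min(cands, key=lambda w: deg[w])
        deg.append(1)       # the new vertex v has degree 1
        deg[u] += 1
        pool += [u, v]      # keep the pool proportional to degree
    return deg

deg = choice_pa_tree(2000)
```

With the min-degree choice the maximum degree stays far below the polynomial growth of plain preferential attachment; setting choices=1 recovers the standard model for comparison.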
- …