The error-bounded descriptional complexity of approximation networks
It is well known that artificial neural nets can approximate any continuous function to any desired degree of accuracy and can therefore be used, e.g., in high-speed, real-time process control. Nevertheless, for a given application and a given network architecture, the non-trivial task remains of determining the number of neurons and the accuracy (number of bits) per weight necessary for satisfactory operation; these are critical issues in VLSI and computer implementations of non-trivial tasks. In this paper the accuracy of the weights and the number of neurons are seen as general system parameters which determine the maximal approximation error through the absolute amount and the relative distribution of information contained in the network. We define the error-bounded network descriptional complexity as the minimal number of bits for a class of approximation networks that achieve a given approximation error, and we meet the conditions for this goal by the new principle of optimal information distribution. For two examples, a simple linear approximation of a non-linear, quadratic function and a non-linear approximation of the inverse kinematic transformation used in robot manipulator control, the principle of optimal information distribution gives the optimal number of neurons and the resolutions of the variables, i.e. the minimal amount of storage for the neural net. Keywords: Kolmogorov complexity, ε-entropy, rate-distortion theory, approximation networks, information distribution, weight resolutions, Kohonen mapping, robot control
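The trade-off the abstract describes can be mimicked with a toy sketch (all names and numbers here are illustrative assumptions, not the paper's actual procedure): approximate the quadratic f(x) = x² on [0, 1] by a piecewise-linear "network" with n segments whose stored node values are quantized to b bits, and observe that both parameters bound the worst-case error.

```python
# Toy illustration (not the paper's algorithm): approximate f(x) = x^2 on [0, 1]
# by a piecewise-linear interpolant with n segments, then quantize the stored
# node values to b bits and measure the resulting worst-case error.

def quantize(v, bits):
    """Round v in [0, 1] to a uniform grid with 2**bits levels."""
    levels = (1 << bits) - 1
    return round(v * levels) / levels

def max_error(n_segments, bits, samples=1000):
    """Worst-case |x^2 - approximation| over a dense sample grid."""
    # Stored "weights": quantized function values at the n+1 interpolation nodes.
    nodes = [quantize((i / n_segments) ** 2, bits) for i in range(n_segments + 1)]
    worst = 0.0
    for s in range(samples + 1):
        x = s / samples
        i = min(int(x * n_segments), n_segments - 1)  # segment index
        t = x * n_segments - i                        # position within the segment
        approx = (1 - t) * nodes[i] + t * nodes[i + 1]
        worst = max(worst, abs(x * x - approx))
    return worst

# More segments reduce the interpolation error; more bits reduce the
# quantization error. Total storage is (n_segments + 1) * bits.
print(max_error(4, 16))   # error dominated by too few segments
print(max_error(64, 3))   # error dominated by coarse weights
```

Balancing the two error sources against the total bit budget is the flavor of the "optimal information distribution" question the paper formalizes.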
The performance of approximating ordinary differential equations by neural nets
The dynamics of many systems are described by ordinary differential equations (ODEs). Solving ODEs with standard methods (i.e. numerical integration) requires a large amount of computing time but only a small amount of storage memory. For some applications, e.g. short-term weather forecasting or real-time robot control, long computation times are prohibitive. Is there a method which uses less computing time (but has drawbacks in other aspects, e.g. memory), so that the computation of ODEs gets faster? We discuss this question under the assumption that the alternative computation method is a neural network trained on the ODE dynamics, and compare both methods at the same approximation error. This comparison is done with two different errors. First, we use the standard error, which measures the difference between the approximation and the solution of the ODE but is hard to characterize. In many cases, as for physics engines used in computer games, the shape of the approximation curve is important rather than the exact values of the approximation. Therefore, we introduce a subjective error based on the Total Least Square Error (TLSE), which gives more consistent results. For the final performance comparison, we calculate the optimal resource usage for the neural network and evaluate it depending on the resolution of the interpolation points and the inter-point distance. Our conclusion gives a method to evaluate where neural nets are advantageous over numerical ODE integration and where this is not the case. Index Terms—ODE, neural nets, Euler method, approximation complexity, storage optimization
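As a minimal, self-contained illustration of the numerical baseline (the concrete ODE and step counts are assumed here, not taken from the paper), forward Euler trades many cheap arithmetic steps for essentially no storage:

```python
# Minimal sketch (assumed example, not from the paper): forward-Euler
# integration of the ODE y' = -y, y(0) = 1, whose exact solution is exp(-t).

import math

def euler(f, y0, t_end, steps):
    """Integrate y' = f(t, y) from t = 0 to t_end with fixed-step forward Euler."""
    h = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        y += h * f(t, y)   # many cheap steps: compute-heavy, but no stored table
        t += h
    return y

approx = euler(lambda t, y: -y, 1.0, 1.0, 1000)
exact = math.exp(-1.0)
print(abs(approx - exact))  # error shrinks roughly linearly with the step size
```

A lookup table of precomputed trajectory points would invert this trade: near-instant evaluation at the cost of memory, which is the comparison the paper quantifies.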
Methods for Structural Pattern Recognition: Complexity and Applications
Department of Cybernetics (Katedra kybernetik)
Coding-theorem Like Behaviour and Emergence of the Universal Distribution from Resource-bounded Algorithmic Probability
Previously referred to as 'miraculous' in the scientific literature because of its powerful properties and its wide application as an optimal solution to the problem of induction/inference, (approximations to) Algorithmic Probability (AP) and the associated Universal Distribution are (or should be) of the greatest importance in science. Here we investigate the emergence, the rates of emergence and convergence, and the Coding-theorem-like behaviour of AP in Turing-subuniversal models of computation. We investigate empirical distributions of computing models in the Chomsky hierarchy. We introduce measures of algorithmic probability and algorithmic complexity based upon resource-bounded computation, in contrast to the previously thoroughly investigated distributions produced from the output distribution of Turing machines. This approach allows for numerical approximations to algorithmic (Kolmogorov-Chaitin) complexity-based estimations at each of the levels of a computational hierarchy. We demonstrate that all these estimations are correlated in rank and that they converge both in rank and in values as a function of computational power, despite fundamental differences between the computational models. In the context of natural processes that operate below the Turing-universal level because of finite resources and physical degradation, the investigation of natural biases stemming from algorithmic rules may shed light on the distribution of outcomes. We show that up to 60% of the simplicity/complexity bias in distributions produced even by the weakest of the computational models can be accounted for by Algorithmic Probability in its approximation to the Universal Distribution. Comment: 27 pages main text, 39 pages including supplement. Online complexity calculator: http://complexitycalculator.com
On similarity prediction and pairwise clustering
We consider the problem of clustering a finite set of items from pairwise similarity information. Unlike what is done in the literature on this subject, we do so in a passive learning setting, and with no specific constraints on the cluster shapes other than their size. We investigate the problem in two settings: (i) an online setting, where we provide a tight characterization of the prediction complexity in the mistake-bound model, and (ii) a standard stochastic batch setting, where we give tight upper and lower bounds on the achievable generalization error. Prediction performance is measured both in terms of the ability to recover the similarity function encoding the hidden clustering and in terms of how well we classify each item within the set. The proposed algorithms are time-efficient.
Qualitative Reasoning on Complex Systems from Observations
A hybrid approach to the phenomenological reconstruction of Complex Systems (CS), using Formal Concept Analysis (FCA) as the main tool for conceptual data mining, is proposed. To illustrate the method, a classic CS is selected (cellular automata) to show how FCA can assist in predicting CS evolution under different conceptual descriptions (from different observable features of the CS).
Junta de Andalucía TIC-606
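For concreteness, the "classic CS" named in the abstract can be simulated in a few lines; this sketch shows only the cellular automaton itself (rule 110 is an assumed choice), not the FCA-based analysis:

```python
# Minimal elementary cellular automaton stepper (the classic complex system the
# abstract illustrates with). The FCA-based conceptual analysis is not shown.

def step(cells, rule=110):
    """One synchronous update of a 1-D binary CA with periodic boundaries."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighborhood = (left << 2) | (center << 1) | right  # 3-bit index, 0..7
        out.append((rule >> neighborhood) & 1)              # look up the rule bit
    return out

row = [0] * 10 + [1] + [0] * 10  # single seed cell
for _ in range(5):
    row = step(row)
print(row)
```

Each row of such an evolution can serve as an "object" with observable features, which is the kind of data an FCA concept lattice is built from.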
Asymmetry of the Kolmogorov complexity of online predicting odd and even bits
Symmetry of information states that C(x, y) = C(x) + C(y|x) + O(log C(x, y)).
We show that a similar relation for online Kolmogorov complexity does not hold.
Let the even (online Kolmogorov) complexity of an n-bit string x = x1 x2 ... xn be the length of a shortest program that computes x2 on input x1, computes x4 on input x1 x2 x3, etc.; and similarly for odd complexity. We show that for all n there exists an n-bit x such that both the odd and the even complexity are almost as large as the Kolmogorov complexity of the whole
string. Moreover, flipping odd and even bits to obtain the sequence x2 x1 x4 x3 ... decreases the sum of odd and even complexity to (3/2) C(x) + O(log n). Comment: 20 pages, 7 figures
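Kolmogorov complexity is uncomputable, so the result itself cannot be demonstrated in code; the pair-swap transformation used in the final claim, however, is straightforward (a hypothetical helper written for illustration):

```python
# The swap used in the abstract: exchange each adjacent odd/even pair of bits,
# x1 x2 x3 x4 ... -> x2 x1 x4 x3 ...  (Kolmogorov complexity itself is
# uncomputable, so only the transformation is illustrated here.)

def swap_pairs(bits):
    """Swap characters at positions (1,2), (3,4), ...; a trailing one stays put."""
    out = list(bits)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return "".join(out)

print(swap_pairs("abcdef"))  # -> "badcfe"
print(swap_pairs(swap_pairs("10110100")) == "10110100")  # the swap is an involution
```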
SICA: Subjectively Interesting Component Analysis
The information in high-dimensional datasets is often too complex for human users to perceive directly. Hence, it may be helpful to use dimensionality reduction methods to construct lower-dimensional representations that can be visualized. The natural question that arises is: how do we construct the most informative low-dimensional representation? We study this question from an information-theoretic perspective and introduce a new method for linear dimensionality reduction. The obtained model that quantifies the informativeness also allows us to flexibly account for prior knowledge a user may have about the data. This enables us to provide representations that are subjectively interesting. We call the method Subjectively Interesting Component Analysis (SICA) and expect it to be mainly useful for iterative data mining. SICA is based on a model of a user's belief state about the data. This belief state is used to search for surprising views. The initial state is chosen by the user (it may be empty up to the data format) and is updated automatically as the analysis progresses. We study several types of prior beliefs: if a user only knows the scale of the data, SICA yields the same cost function as Principal Component Analysis (PCA), while if a user expects the data to have outliers, we obtain a variant that we term t-PCA. Finally, scientifically more interesting variants are obtained when a user has more complicated beliefs, such as knowledge about similarities between data points. The experiments suggest that SICA enables users to find subjectively more interesting representations.
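The claim that a scale-only prior recovers PCA can be made concrete with a minimal top-principal-component computation (pure-Python power iteration on the covariance matrix; this is an assumed illustration of the PCA special case, not an implementation of SICA):

```python
# Not SICA itself: a minimal top-principal-component computation (the PCA
# special case the abstract mentions), via power iteration in pure Python.

def top_component(data, iters=200):
    """First principal component of a list of d-dimensional points."""
    d = len(data[0])
    n = len(data)
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    # Covariance matrix C = X^T X / n of the mean-centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # power iteration converges to the dominant eigenvector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Points spread mainly along the x-axis: the top component is close to (±1, 0).
pc = top_component([(-3, 0.1), (-1, -0.2), (1, 0.2), (3, -0.1)])
print(pc)
```

SICA generalizes this objective by replacing the implicit scale-only prior with richer user belief states, so the direction found depends on what the user already expects.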