7 research outputs found
Information theoretic approach to interactive learning
The principles of statistical mechanics and information theory play an
important role in learning and have inspired both theory and the design of
numerous machine learning algorithms. The new aspect in this paper is a focus
on integrating feedback from the learner. A quantitative approach to
interactive learning and adaptive behavior is proposed, integrating model- and
decision-making into one theoretical framework. This paper follows simple
principles by requiring that the observer's world model and action policy
should result in maximal predictive power at minimal complexity. Classes of
optimal action policies and of optimal models are derived from an objective
function that reflects this trade-off between prediction and complexity. The
resulting optimal models then summarize, at different levels of abstraction,
the process's causal organization in the presence of the learner's actions. A
fundamental consequence of the proposed principle is that the learner's optimal
action policies balance exploration and control as an emerging property.
Interestingly, the explorative component is present in the absence of policy
randomness, i.e. in the optimal deterministic behavior. This is a direct result
of requiring maximal predictive power in the presence of feedback. Comment: 6 pages
Network information and connected correlations
Entropy and information provide natural measures of correlation among
elements in a network. We construct here the information theoretic analog of
connected correlation functions: irreducible $N$-point correlation is measured
by a decrease in entropy for the joint distribution of $N$ variables relative
to the maximum entropy allowed by all the observed $N-1$ variable
distributions. We calculate the ``connected information'' terms for several
examples, and show that it also enables the decomposition of the information
that is carried by a population of elements about an outside source. Comment: 4 pages, 3 figures
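The entropy-difference idea in the abstract above can be illustrated with a small sketch. It is not the authors' code, and the example (three binary variables with z = x XOR y), the helper names `entropy` and `marginal`, and the labels `p1`, `p2`, `connected_2`, `connected_3` are all assumptions chosen for illustration. The XOR case is used because its maximum entropy distributions can be written down by hand: every pairwise marginal is uniform, so the uniform distribution over all eight states is the maximum entropy distribution consistent with them, and no iterative fitting is needed.

```python
import itertools
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(dist, idx):
    """Marginal distribution over the variable positions listed in idx."""
    out = defaultdict(float)
    for state, p in dist.items():
        out[tuple(state[i] for i in idx)] += p
    return dict(out)


states = list(itertools.product((0, 1), repeat=3))

# Illustrative joint distribution (an assumption, not from the source):
# x and y are independent fair bits, z = x XOR y.
joint = {(x, y, x ^ y): 0.25 for x, y in itertools.product((0, 1), repeat=2)}

# Maximum entropy distribution given only single-variable marginals:
# the product of those marginals.
singles = [marginal(joint, (i,)) for i in range(3)]
p1 = {s: math.prod(singles[i][(s[i],)] for i in range(3)) for s in states}

# Maximum entropy distribution given all pairwise marginals. For XOR every
# pairwise marginal is uniform, so the uniform distribution matches them
# and, having the largest possible entropy, is the maximum entropy solution.
p2 = {s: 1 / 8 for s in states}
for pair in itertools.combinations(range(3), 2):
    m_joint = marginal(joint, pair)
    m_p2 = marginal(p2, pair)
    assert all(abs(m_joint.get(k, 0.0) - m_p2[k]) < 1e-12 for k in m_p2)

# Entropy decrease at each order, in bits.
connected_2 = entropy(p1) - entropy(p2)     # pairwise contribution
connected_3 = entropy(p2) - entropy(joint)  # irreducible triplet contribution

print(connected_2, connected_3)
```

In this example the pairwise term vanishes and the whole dependence appears at third order, which is the kind of irreducible correlation the abstract describes. For general distributions the pairwise-constrained maximum entropy distribution has no closed form and would need a numerical fit, which this sketch deliberately avoids.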
Predictive Rate-Distortion for Infinite-Order Markov Processes
Predictive rate-distortion analysis suffers from the curse of dimensionality: clustering arbitrarily long pasts to retain information about arbitrarily long futures requires resources that typically grow exponentially with length. The challenge is compounded for infinite-order Markov processes, since conditioning on finite sequences cannot capture all of their past dependencies. Spectral arguments confirm a popular intuition: algorithms that cluster finite-length sequences fail dramatically when the underlying process has long-range temporal correlations and can fail even for processes generated by finite-memory hidden Markov models. We circumvent the curse of dimensionality in rate-distortion analysis of finite- and infinite-order processes by casting predictive rate-distortion objective functions in terms of the forward- and reverse-time causal states of computational mechanics. Examples demonstrate that the resulting algorithms yield substantial improvements